Tibetan part-of-speech tagset

A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The Tibetan part-of-speech tagset is used in Tibetan corpora annotated with the Rule-based Part-of-speech Tagger for Classical Tibetan, developed by the research project ‘Tibetan in Digital Communication’ hosted at SOAS, University of London.

Tibetan corpora in Sketch Engine

An example of a tag in the CQL concordance search box: [tag="n.prop"] finds all proper nouns, e.g. བོད་, རབ་འབྱོར་ (note: make sure that you use straight double quotation marks).
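
Queries pasted from a word processor often carry typographic quotation marks, which is the usual way the straight-quote requirement gets violated. The following is a minimal sketch of cleaning a query before pasting it into the search box; the helper name `straighten_quotes` is hypothetical and is not part of Sketch Engine.

```python
# Hypothetical helper (not a Sketch Engine API): replace typographic
# double quotes with straight ones before submitting a CQL query.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # low double quotation mark
})


def straighten_quotes(query: str) -> str:
    """Return the query with curly double quotes made straight."""
    return query.translate(CURLY_TO_STRAIGHT)


print(straighten_quotes("[tag=\u201cn.prop\u201d]"))
```

The result can then be entered in the CQL search box as usual.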

Basic part-of-speech tagset

POS categories	POS tag
Adjectives	adj
Adverbs	adv..*
Case markers	case..*
Clitics	cl..*
Converbs	cv..*
Demonstratives, determiners, etc.	d..*
Nouns	n..*
Negation	neg
Numbers	num..*
Pronouns	p..*
Verbs (and verbal nouns)	v..* (n.v..*)
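
The patterns in the table are regular expressions, so an unescaped dot matches any character, not only a literal full stop. The sketch below uses Python's `re.fullmatch` as a stand-in for CQL matching, on the assumption that a CQL value must match the whole tag; the short tag list is a sample taken from the detailed tagset. It suggests that `n..*` would also pick up `neg` and `num.card`, while escaping the first dot (`n\..*`) restricts the match to noun tags.

```python
import re

# A small sample of tags from the detailed tagset.
TAGS = ["adj", "n.count", "n.prop", "n.v.past", "neg", "num.card", "v.past"]


def matching_tags(pattern, tags=TAGS):
    """Return the tags that the pattern matches in full.

    Assumption: CQL compares the pattern against the whole attribute
    value, which re.fullmatch imitates here.
    """
    compiled = re.compile(pattern)
    return [tag for tag in tags if compiled.fullmatch(tag)]


print(matching_tags(r"n..*"))      # unescaped dot: any character
print(matching_tags(r"n\..*"))     # escaped dot: literal full stop
print(matching_tags(r"n\.v\..*"))  # verbal nouns only
```

If the stricter behaviour is wanted in a query, the escaped form would be written as [tag="n\..*"].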

Detailed POS tagset

POS tag	Description
adj	adjective
adv.dir	directional adverb
adv.intense	intensive adverb
adv.mim	mimetic adverb
adv.proclausal	proclausal adverb
adv.temp	temporal adverb
case.abl	ablative (affix -las after a noun phrase)
case.agn	agentive (affixes -kyis, -gyis, -gis, -yis, -s)
case.all	allative (affix -la after a noun phrase)
case.ass	associative (affix -daṅ after a noun phrase)
case.comp	comparative (affixes -bas and -pas after a noun phrase)
case.ela	elative (affix -las after a noun phrase)
case.gen	genitive (affixes -kyi, -gyi, -gi, -yi, -ḥi)
case.loc	locative (affix -na after a noun phrase)
case.nare	quotative (affixes -na, -re)
case.term	terminative (affixes -du, -tu, -su, -ru, -r)
cl.lta	clitic lta in the combinations lta ste and na lta
cl.tsam	the clitic -tsam
cl.focus	the focus clitic ni
cl.quot	the quotative clitic ces
cv.abl	affix -las after a verb stem
cv.agn	affixes -gis
cv.all	affix -la after a verb stem
cv.are	affix -ta-re and its allomorphs after a verb stem
cv.ass	affix -daṅ after a verb stem
cv.ela	affix -las after a verb stem
cv.fin	affixes -to
cv.gen	affixes -gi
cv.imp	affixes -cig
cv.impf	affixes -ciṅ
cv.loc	affix -na after a verb stem
cv.ques	affix -tam and its allomorphs
cv.sem	affixes -te
cv.term	affixes -tu
d.dem	demonstratives
d.det	determiners
d.emph	emphatics
d.indef	indefinites
d.plural	plurals
d.tsam	tsam
dunno	a word that we have not been able to analyze
interj	interjection
n..*	noun
n.count	lexical nouns
n.mass	mass nouns
n.prop	proper nouns
n.rel	relator nouns
n.v.aux	auxiliary verbal noun
n.v.cop	copula verbal noun
n.v.fut	future verbal noun
n.v.fut.n.v.past	future/past verbal noun
n.v.fut.n.v.pres	future/present verbal noun
n.v.imp	imperative verbal noun
n.v.invar	invariable verbal noun
n.v.neg	negative verbal noun
n.v.past	past verbal noun
n.v.past.n.v.pres	past/present verbal noun
n.v.pres	present verbal noun
neg	two negation prefixes ma and mi
num.*	numeral
num.card	cardinal number
num.ord	ordinal number
numeral	numeral
p.indef	indefinite pronouns
p.interrog	interrogative pronouns
p.pers	personal pronouns
p.refl	personal reflexive
punc	punctuation mark
sent	end of sentence punctuation
skt
v.aux	auxiliary verbs
v.cop	copula verbs
v.cop.neg	negative copula verb
v.fut	future verb stem
v.fut.v.past	future/past verb stem
v.fut.v.pres	future/present verb stem
v.imp	imperative verb stem
v.invar	invariable verb stem
v.neg	the inherently negative verb med
v.past	past verb stem
v.past.v.pres	past/present verb stem
v.pres	present verb stem

Note: word forms with and without a tsheg (e.g. ཐོག་ and ཐོག) are separate lexical entries, but both are normalized to the same form in the attribute “notsheg”.
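
As an illustration of this kind of normalization, the sketch below strips a trailing tsheg (U+0F0B) from a word form so that both variants collapse to one string. This is only an assumption about what such a normalization might look like; the exact rule used to build the “notsheg” attribute is not documented here, and the function name `strip_tsheg` is hypothetical.

```python
TSHEG = "\u0f0b"  # TIBETAN MARK INTERSYLLABIC TSHEG


def strip_tsheg(word: str) -> str:
    """Remove any trailing tsheg; tshegs inside the word are kept."""
    return word.rstrip(TSHEG)


with_tsheg = "\u0f50\u0f7c\u0f42\u0f0b"   # ཐོག་
without_tsheg = "\u0f50\u0f7c\u0f42"      # ཐོག

# Both forms normalize to the same string.
print(strip_tsheg(with_tsheg) == strip_tsheg(without_tsheg))
```

Only the final tsheg is removed, so a multi-syllable form such as རབ་འབྱོར་ would keep the tsheg between its syllables.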

Source

http://larkpie.net/tibetancorpus/
https://soas-repository.worktribe.com/output/420898

Reference

Garrett, Edward and Hill, Nathan W. and Zadoks, Abel (2014) ‘A Rule-based Part-of-speech Tagger for Classical Tibetan.’ Himalayan Linguistics, 13 (1). pp. 9-57. (CC BY-NC-ND 4.0)
