Helsinki Corpus of Swahili 2.0 (HCS 2.0)
General information
The Helsinki Corpus of Swahili 2.0 is available in the Language Bank of Finland (Kielipankki) as two versions: a Not Annotated Version and an Annotated Version.
The Not Annotated Version of the Helsinki Corpus of Swahili 2.0 is available for download. (More information in META-SHARE.)
The Annotated Version of the Helsinki Corpus of Swahili 2.0 is available in the Korp concordance service at http://urn.fi/urn:nbn:fi:lb-201608301. (More information in META-SHARE.)
To use the HCS 2.0 Annotated Version in Korp, you need either academic status at an institution belonging to the Haka or eduGAIN federation, or personal access rights, which you can apply for. You also need to log in to Korp.
Moving from HCS in Lemmie to HCS 2.0 in Korp
If you have used the previous version of HCS in the Lemmie service or if you try to replicate the examples in the HCS instructions for Lemmie in Korp’s extended search, please see the instructions for moving from Lemmie to Korp. In particular, note the following differences:
- Annotation values in HCS 2.0 are in uppercase.
- Some morphological features are in the attribute msdextra, although most are in msd. See the headings of the annotation tables below for the attribute names.
- The attribute values of the CQP expressions used in Korp are regular expressions, so you need to replace the truncation symbol “*” of Lemmie with “.*” (a full stop followed by an asterisk).
- Attribute constraints for a single token are separated by an ampersand (&) in CQP.
For example, the Lemmie query
[pos='v' msd='*caus*' msd='*appl*']
is converted to the following CQP query for Korp:
[pos='V' & msdextra='.*CAUS.*' & msdextra='.*APPL.*']
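The conversion rules above can be sketched as a small Python helper. This is only an illustration under stated assumptions, not a tool provided by the Language Bank: the function name lemmie_to_cqp and the parameter msdextra_features are hypothetical, and the caller has to say which features live in msdextra, because that can only be read from the headings of the annotation tables.

```python
import re


def lemmie_to_cqp(query, msdextra_features=frozenset()):
    """Convert a single-token Lemmie query to a CQP token expression.

    msdextra_features is a hypothetical parameter: the set of feature names
    (in uppercase) that should be looked up in msdextra instead of msd.
    """
    query = query.strip()
    if not (query.startswith("[") and query.endswith("]")):
        raise ValueError("expected a query of the form [attr='value' ...]")

    # Collect the attr='value' constraints inside the square brackets.
    constraints = re.findall(r"(\w+)='([^']*)'", query[1:-1])

    parts = []
    for attr, value in constraints:
        # Annotation values in HCS 2.0 are in uppercase.
        value = value.upper()
        # Move the constraint to msdextra if the caller listed the feature.
        if attr == "msd" and value.strip("*") in msdextra_features:
            attr = "msdextra"
        # Lemmie truncation symbol "*" becomes the regular expression ".*".
        parts.append(f"{attr}='{value.replace('*', '.*')}'")

    # Constraints on a single token are joined with an ampersand in CQP.
    return "[" + " & ".join(parts) + "]"


print(lemmie_to_cqp("[pos='v' msd='*caus*' msd='*appl*']",
                    msdextra_features={"CAUS", "APPL"}))
```

With the arguments shown, the helper reproduces the converted query given above. Note that it uppercases the whole value, which would be wrong for values that contain a lowercase marker, such as TAM=PERF:me in the tables below; such queries need to be adjusted by hand.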
Annotations in the annotated version of HCS 2.0
The following tables list the annotation features used in the annotated version of HCS 2.0 available in Korp.
Part-of-speech features (attribute pos)
ABBR | abbreviation
ADJ | adjective
ADV | adverb
AG-PART | agent particle
CC | coordinating conjunction
CONJ | conjunction
CONJ/CC | conjunction or coordinator
DEM | demonstrative pronoun
EXCLAM | exclamation
GEN-CON | genitive connector
GEN-CON-KWA | genitive connector, classes 15 and 17
INTERROG | interrogative
N | noun
NUM | numeral
NUM-ROM | Roman numeral
POSS-PRON | possessive pronoun
PREP | preposition
PREP/ADV | preposition or adverb
PRON | pronoun
PROPNAME | proper name
REL-LI | relative construction with -li- marker
REL-LI-VYO | relative construction with -li- marker, referring to manner
REL-SI | relative negative construction with -si- marker
REL-SI-VYO | relative negative construction with -si- marker, referring to manner
TITLE | title
V | verb
V-BE | auxiliary verb
V-DEF | defective verb
Punctuation marks and diacritics (attribute pos)
COLON | colon
COMMA | comma
DOUBLE-QUOTE | double quote
DOUBLE-QUOTE-CLOSING | closing double quote
DOUBLE-QUOTE-OPENING | opening double quote
HYPHEN | hyphen
LEFT-PARENTHESIS | left parenthesis
PERCENT-MARK | percent mark
QUESTION-MARK | question mark
RIGHT-PARENTHESIS | right parenthesis
SEMI-COLON | semicolon
SINGLE-QUOTE | single quote
SINGLE-QUOTE-CLOSING | closing single quote
SINGLE-QUOTE-OPENING | opening single quote
SLASH | slash
STOP | stop
Features for nouns (attribute msd)
1/2-PL | plural of the noun class group 1/2
1/2-SG | singular of the noun class group 1/2
10-PL | noun class 10 plural
11-SG | noun class 11 singular
11/10-PL | plural of the noun class group 11/10
11/10-SG | singular of the noun class group 11/10
11/6-PL | plural of the noun class group 11/6
11/6-SG | singular of the noun class group 11/6
15-SG | noun class 15 singular
16-SG | noun class 16 singular
17-SG | noun class 17 singular
18-SG | noun class 18 singular
3/4-PL | plural of noun class group 3/4
3/4-SG | singular of noun class group 3/4
5/6-PL | plural of noun class group 5/6
5/6-SG | singular of noun class group 5/6
6-PLSG | noun class 6 plural with singular meaning, e.g. ’maji’
7/8-PL | plural of noun class group 7/8
7/8-SG | singular of noun class group 7/8
9/10-PL | plural of noun class group 9/10
9/10-SG | singular of noun class group 9/10
9/6-PL | plural of noun class group 9/6
9/6-SG | singular of noun class group 9/6
Single noun class features (attribute msd)
1-SG | noun class 1, singular
1-SG1 | first person singular
1-SG3 | third person singular
2-PL | noun class 2 plural
2-PL1 | first person plural
3-SG | noun class 3 singular
4-PL | noun class 4 plural
5-SG | noun class 5 singular
6-PL | noun class 6 plural
7-SG | noun class 7 singular
8-PL | noun class 8 plural
9-SG | noun class 9 singular
PL1 | first person plural
PL2 | second person plural
PL3 | third person plural
SG | singular
SG1 | first person singular
SG2 | second person singular
SG3 | third person singular
LOC | locative
LOC-16 | locative class 16
LOC-17 | locative class 17
LOC-18 | locative class 18
Features for adjectives and numerals
A-INFL | inflecting adjective
A-UNINFL | noninflecting adjective
AD-ADJ | adverb modifying adjective
ADJ-POST | postmodifying adjective
ADJ-PR-REL | adjective constructed with present tense relative
ADJ-PRE | premodifying adjective
ADJ-REL | adjective constructed with relative verb structure
ADJ-REL-NEG | adjective constructed with negative present tense relative
CARD | cardinal number
COMP | comparative
NUM-INFL | inflecting numeral
NUM-ROM | Roman numeral
NUM-UNINFL | noninflecting numeral
ORD | ordinal number
OTE | inflection class ’OTE’
SUPER | superlative
Features for verbs: Subject prefix (attribute msd)
9-SG-SP | subject prefix of noun class 9 singular
4-PL-SP | subject prefix of noun class 4 plural
SUB-PREF=1-SG | subject prefix of class 1 singular
SUB-PREF=1-SG1 | subject prefix of first person singular
SUB-PREF=1-SG2 | subject prefix of second person singular
SUB-PREF=1-SG3 | subject prefix of third person singular
SUB-PREF=10-PL | subject prefix of class 10 plural
SUB-PREF=11-SG | subject prefix of class 11 singular
SUB-PREF=15-SG | subject prefix of class 15 singular
SUB-PREF=16-SG | subject prefix of class 16 singular
SUB-PREF=17-SG | subject prefix of class 17 singular
SUB-PREF=18-SG | subject prefix of class 18 singular
SUB-PREF=2-PL | subject prefix of class 2 plural
SUB-PREF=2-PL1 | subject prefix of first person plural
SUB-PREF=2-PL2 | subject prefix of second person plural
SUB-PREF=2-PL3 | subject prefix of third person plural
SUB-PREF=3-SG | subject prefix of class 3 singular
SUB-PREF=4-PL | subject prefix of class 4 plural
SUB-PREF=5-SG | subject prefix of class 5 singular
SUB-PREF=6-PL | subject prefix of class 6 plural
SUB-PREF=6-PLSG | subject prefix of class 6 plural with singular meaning
SUB-PREF=7-SG | subject prefix of class 7 singular
SUB-PREF=8-PL | subject prefix of class 8 plural
SUB-PREF=9-SG | subject prefix of class 9 singular
SUB-PREF=HABIT-PL | subject prefix of habitual verb form plural
SUB-PREF=HABIT-SG | subject prefix of habitual verb form singular
SUB-PREF=PL1 | subject prefix of first person plural
SUB-PREF=PL2 | subject prefix of second person plural
SUB-PREF=SG1 | subject prefix of first person singular
SUB-PREF=SG2 | subject prefix of second person singular
Features for verbs: TAM markers (attribute msd)
TAM=COND-NEG:singe | conditional present negative, marker -singe-
TAM=COND:ki | conditional present, marker -ki-
TAM=COND:nge | conditional present, marker -nge-
TAM=COND:ngeli | conditional past, marker -ngeli-
TAM=FUT | future
TAM=FUT:ta | future, marker -ta-
TAM=FUT:taka | future, marker -taka-
TAM=FUT:to | future, marker -to-
TAM=NARR:ka | narrative, marker -ka-
TAM=NEG-a | present negative
TAM=PAST | simple past
TAM=PAST-NEG:ku | past negative, marker -ku-
TAM=PAST-NEG:kw | past negative, marker -kw-
TAM=PAST:liisha | past, marker -liisha-
TAM=PAST:likwisha | past, marker -likwisha-
TAM=PAST:lisha | past, marker -lisha-
TAM=PERF:me | perfect, marker -me-
TAM=PERF:meisha | perfect, marker -meisha-
TAM=PERF:mekwisha | perfect, marker -mekwisha-
TAM=PERF:mesha | perfect, marker -mesha-
TAM=PERF:sha | perfect, marker -sha-
TAM=PR:a | present, marker -a-
TAM=PR:na | present, marker -na-
TAM=SBJN | subjunctive
TAM=SBJN-CONS | subjunctive consecutive
IMP | imperative
IMP-PL2 | imperative of second person plural
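As an illustration of how the verb features in the tables above might be combined in Korp's extended search, the following Python sketch builds a CQP token expression that requires several msd features on the same verb. It assumes, by analogy with the msdextra example earlier on this page, that one attribute value can hold several features, so each feature is wrapped in “.*”; the helper names feature_constraint and verb_query are hypothetical.

```python
def feature_constraint(attribute, feature):
    """Return a CQP constraint matching values that contain the feature."""
    # Assumption: the attribute value may contain other features as well,
    # so the feature is allowed to occur anywhere in the value.
    return f"{attribute}='.*{feature}.*'"


def verb_query(features, attribute="msd"):
    """Build a CQP expression for a verb carrying all the given features."""
    parts = ["pos='V'"]
    parts.extend(feature_constraint(attribute, f) for f in features)
    # Constraints on a single token are joined with an ampersand.
    return "[" + " & ".join(parts) + "]"


# Verbs with a third person singular subject prefix and the perfect marker -me-.
print(verb_query(["SUB-PREF=1-SG3", "TAM=PERF:me"]))
```

Because the match is a plain substring match, a shorter feature name also matches longer names that begin with it: SUB-PREF=1-SG would match SUB-PREF=1-SG1 as well. Feature names containing regular expression metacharacters would need escaping, which this sketch does not do.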
Features for verbs: Relative prefix (attribute msd)
REL-PREF=1-SG-SUB | relative prefix referring to subject of class 1 singular
REL-PREF=10-PL | relative prefix referring to class 10 plural
REL-PREF=11-SG | relative prefix referring to class 11 singular
REL-PREF=15-SG | relative prefix referring to class 15 singular
REL-PREF=16-SG | relative prefix referring to class 16 singular
REL-PREF=17-SG | relative prefix referring to class 17 singular
REL-PREF=18-SG | relative prefix referring to class 18 singular
REL-PREF=2-PL-SUB | relative prefix referring to subject of class 2 plural
REL-PREF=3-SG | relative prefix referring to class 3 singular
REL-PREF=4-PL | relative prefix referring to class 4 plural
REL-PREF=5-SG | relative prefix referring to class 5 singular
REL-PREF=6-PL | relative prefix referring to class 6 plural
REL-PREF=6-PLSG | relative prefix referring to class 6 plural with singular meaning
REL-PREF=7-SG | relative prefix referring to class 7 singular
REL-PREF=8-PL | relative prefix referring to class 8 plural
REL-PREF=9-SG | relative prefix referring to class 9 singular
1-SG-OBJ-REL | noun class 1, relative, referring to object
10-PL-REL | relative prefix referring to noun class 10 plural
16-SG-REL | relative prefix referring to noun class 16 singular
2-PL-OBJ-REL | relative prefix referring to object
3-SG-REL | relative prefix referring to noun class 3 singular
8-PL-REL | relative prefix referring to noun class 8 plural
Features for verbs: Object prefix (attribute msd)
OBJ-PREF=1-SG1 | object prefix referring to first person singular
OBJ-PREF=1-SG2 | object prefix referring to second person singular
OBJ-PREF=1-SG3 | object prefix referring to third person singular
OBJ-PREF=10-PL | object prefix referring to class 10 plural
OBJ-PREF=11-SG | object prefix referring to class 11 singular
OBJ-PREF=15-SG | object prefix referring to class 15 singular
OBJ-PREF=16-SG | object prefix referring to class 16 singular
OBJ-PREF=2-PL1 | object prefix referring to first person plural
OBJ-PREF=2-PL2 | object prefix referring to second person plural
OBJ-PREF=2-PL3 | object prefix referring to third person plural
OBJ-PREF=3-SG | object prefix referring to class 3 singular
OBJ-PREF=4-PL | object prefix referring to class 4 plural
OBJ-PREF=5-SG | object prefix referring to class 5 singular
OBJ-PREF=6-PL | object prefix referring to class 6 plural
OBJ-PREF=7-SG | object prefix referring to class 7 singular
OBJ-PREF=8-PL | object prefix referring to class 8 plural
OBJ-PREF=9-SG | object prefix referring to class 9 singular
OBJ-PREF=PL-REFL | reflexive object prefix -ji- referring to plural
OBJ-PREF=SG-REFL | reflexive object prefix -ji- referring to singular
1-SG2-OBJ | object prefix referring to second person singular
1-SG3-OBJ | object prefix referring to third person singular
10-PL-OBJ | object prefix referring to noun class 10 plural
15-SG-OBJ | object prefix referring to noun class 15 singular
16-SG-OBJ | object prefix referring to noun class 16 singular
17-SG-OBJ | object prefix referring to noun class 17 singular
2-PL1-OBJ | object prefix referring to first person plural
2-PL3-OBJ | object prefix referring to third person plural
7-SG-OBJ | object prefix referring to class 7 singular
9-SG-OBJ | object prefix referring to class 9 singular
SG-REFL-OBJ | reflexive prefix referring to singular
PL-REFL-OBJ | reflexive object prefix -ji- referring to plural
APPL | applicative
CAUS | causative
CS | subordinating conjunction
PASS | passive
PS | passive
REC | reciprocal
REDUPL | reduplication
STAT | stative
AN-S | verb requiring animate subject
HUM-S | verb requiring human subject
AUX-WA | auxiliary verb
COMPL | completed action
COND-IF | conditional verb form with the marker -if-
EMPH | emphasis
HUM-ACT | verb expressing human action
HUM-S | verb requiring human subject
INFMARK | infinitive marker
MONOSLB | monosyllabic verb
NEG | negative
NO-IN | preposition ’in’ not required
NO-TO | preposition ’to’ not required
NOSUBJ | no subject required
OBJ | object prefix
REL-LI | relative verb construction with the tense marker -li-
REL-LI-VYO | relative verb construction with the marker -vyo-
REL-SI | relative prefix, marker -si-
SV | intransitive verb
SVO | monotransitive verb
SVOO | ditransitive verb
VFIN | finite verb
INF | infinitive
INF-NEG | negative infinitive
**CLB | clause boundary
<LOC | locative on the left
AR | word of Arabic origin
CAP | capital initial
CC-PL | associational copula attached to plural, e.g. ’nao’
CC-SG | associational copula attached to singular, e.g. ’naye’
DIM | diminutive
DOM-AN | domestic animal
FEM | female
HC | health care domain
HUM | human
IDIOM | idiom
MALE | male
MASS | mass
MWE | multi-word expression
NA-POSS | possessive pronoun ’na’
NO-GLOSS | no gloss required
NOART | no article required
NOGLOSS | no gloss required
NON-STD | non-standard form
NOUN+POSSESSIVE | noun and possessive combined
NOVERB | no verb required
PERS | Persian origin
PLACE | place
POR | Portuguese origin
POSS | possessive
PREFR | preferred reading
PROP-CAND | candidate for proper name
PROVERB | proverb
TIME | time
WEEK | week
Syntactic tags (attribute syntax)
@-FAUXV | non-finite auxiliary predicator
@-FMAINV | non-finite main predicator
@-FMAINV-n | non-finite main predicator
@-FMAINVkwisha< | non-finite main predicator, referring to completed action
@<AD-A | postmodifying ad-adjective
@<DN | determiner, noun on the left
@<NADJ | adjective qualifier, noun on the left
@<NDEM | demonstrative, noun on the left
@<NH | postmodifying noun, head on the left
@<P | other postmodifier
@<QN | postmodifying quantifier
@A> | appositional premodifier
@AD-A> | premodifying ad-adjective
@ADVL | adverbial
@AG | agentive adverbial
@CC | coordinator
@CS | subordinator
@DN> | determiner, noun on the right
@FAUXV | finite auxiliary predicator
@FMAINV | finite main predicator
@FMAINVintr | finite main predicator, intransitive
@FMAINVintr-ass | defective main predicator, associated with subject
@FMAINVintr-def | defective main predicator, intransitive
@FMAINVintr-loc | defective main predicator, locative
@FMAINVtr+OBJ> | finite main predicator, transitive
@FMAINVtr-OBJ> | finite main predicator, intransitive
@GCON | genitive
@I-OBJ | indirect object
@NADJ | adjective
@NADJ> | postmodifying adjective
@NDEM> | demonstrative, noun on the right
@NH | noun head
@OBJ | object
@PAT | patient
@PCOMPL-S | subject complement
@P> | other premodifier
@QN | quantifier
@SUBJ | subject
@SUBJ+rel | referent of the following relative verb
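Because attribute values in CQP are regular expressions, syntactic tags that contain characters such as “+” need to be escaped when they are meant literally. The following Python sketch shows one way to do this and to build a query for a sequence of tokens. It assumes that the syntax attribute of a token holds exactly one tag, which this page does not state; the helper names escape_tag and syntax_sequence are hypothetical.

```python
import re

# Characters with a special meaning in regular expressions.
_META = re.compile(r"([.^$*+?()\[\]{}|\\])")


def escape_tag(tag):
    """Put a backslash before each regular expression metacharacter."""
    return _META.sub(r"\\\1", tag)


def syntax_sequence(tags):
    """Build a CQP query matching consecutive tokens with the given tags."""
    # Assumption: each token's syntax value is a single tag, so the
    # escaped tag is matched against the whole value.
    return " ".join(f"[syntax='{escape_tag(t)}']" for t in tags)


# A subject immediately followed by a transitive finite main predicator.
print(syntax_sequence(["@SUBJ", "@FMAINVtr+OBJ>"]))
```

If a token can carry several syntactic tags in one value, the escaped tag would have to be surrounded by “.*” in the same way as the morphological features.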