Unlike many other languages, Czech has no articles. Articles in foreign phrases are annotated as adjectives.
In some languages, articles distinguish gender, number and case. By analogy with Czech, their lemma should be the masculine singular nominative form, while the morphological tag should encode the actual word form in the text. However, this approach is sometimes impossible because the gender or number differs in Czech: La Manche is feminine in French but masculine inanimate in Czech; Los Angeles is plural in Spanish but singular in Czech (and in English). Each such frozen article has to have its own lemma. Thus, los would be annotated el-3_,t_^(šp._člen) / AAMSX----1A---- in "do Prahy přijeli Los Paraguayos" ("Los Paraguayos came to Prague") but los-3_,t_^(šp._člen) / AAXXX----1A---- in "pracuje v Los Angeles" ("he works in Los Angeles").
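To make the difference between the two tags easier to see, the following minimal sketch splits a 15-character positional tag into named positions and reports where two tags differ. The position names (`pos`, `subpos`, `gender`, ...) and the helper names `decode_tag` and `tag_diff` are assumptions made for illustration; they follow the usual layout of the positional tagset and are not part of the annotation tools.

```python
# Assumed names for the 15 tag positions (illustrative, not an official API).
POSITIONS = [
    "pos", "subpos", "gender", "number", "case",
    "possgender", "possnumber", "person", "tense", "grade",
    "negation", "voice", "reserve1", "reserve2", "variant",
]


def decode_tag(tag):
    """Map each position name to the one-character value found in the tag."""
    if len(tag) != len(POSITIONS):
        raise ValueError(
            f"expected {len(POSITIONS)} characters, got {len(tag)}"
        )
    return dict(zip(POSITIONS, tag))


def tag_diff(tag_a, tag_b):
    """Return {position: (value_a, value_b)} for positions where the tags differ."""
    a, b = decode_tag(tag_a), decode_tag(tag_b)
    return {name: (a[name], b[name]) for name in POSITIONS if a[name] != b[name]}


# Tags copied from the two "los" examples in the text.
los_paraguayos = "AAMSX----1A----"
los_angeles = "AAXXX----1A----"

print(tag_diff(los_paraguayos, los_angeles))
# {'gender': ('M', 'X'), 'number': ('S', 'X')}
```

Under these assumptions, the two annotations differ only in the gender and number positions, where the frozen article in Los Angeles carries the unspecified value X.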
The separate lemma reflects the fact that the word form has been frozen since it was borrowed into other languages. However, a separate lemma might not be needed: articles are annotated as adjectives, and adjectives (unlike nouns) are not required to keep a single gender.
Articles merged with a preposition (e.g. French du, Italian della, German aufs, beim, vom, zur, im, am...) are treated as prepositions.
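The rule can be illustrated with a small lookup, given here only as a sketch. The function name `foreign_article_pos` is hypothetical, the merged forms are the ones listed above, and the plain articles are a small sample of the forms in Table 6.2. A form-only lookup cannot decide whether a token really belongs to a foreign phrase (forms such as "a", "am" or "im" are ambiguous), so that decision is assumed to have been made beforehand.

```python
# Merged preposition+article forms named in the text (not an exhaustive list).
MERGED_FORMS = {"du", "della", "aufs", "beim", "vom", "zur", "im", "am"}

# A small sample of plain articles from Table 6.2 (illustrative subset).
PLAIN_ARTICLES = {"the", "der", "die", "das", "le", "les", "il", "el", "los", "las"}


def foreign_article_pos(form):
    """Return the part of speech used for a token already known to be foreign.

    Merged forms are treated as prepositions, plain articles as adjectives;
    None means the form is not covered by this sketch.
    """
    lowered = form.lower()
    if lowered in MERGED_FORMS:
        return "preposition"
    if lowered in PLAIN_ARTICLES:
        return "adjective"
    return None


print(foreign_article_pos("Los"))   # adjective
print(foreign_article_pos("beim"))  # preposition
```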
Table 6.2. Articles in common foreign languages

Language | Form | Lemma | Tag |
---|---|---|---|
English | the | | |
English | a | | |
English | an | | |
German | der | | |
German | die | | |
German | das | | |
German | des | | |
German | dem | | |
German | den | | |
Dutch | de | | |
Dutch | het | | |
Dutch | den | | |
French | le | | |
French | la | | |
French | l | | |
French | les | | |
Italian | il | | |
Italian | la | | |
Italian | gli | | |
Italian | le | | |
Spanish | el | | |
Spanish | la | | |
Spanish | los | | |
Spanish | las | | |
Portuguese | o | | |
Portuguese | a | | |
Portuguese | os | | |
Portuguese | as | | |
Arabic | al, ad, an, ar, as, az | | |
Arabic | el, ed, en, er, es, ez | | |
Hebrew | ha | | |