Parts of Speech and Number of Accents


I thought I'd write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.
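A count like this can be sketched in a few lines. This is a minimal illustration, not the original script: it assumes the lemmata are available as Unicode strings paired with a part-of-speech code, and it treats the combining acute, grave and perispomeni as the accents while ignoring breathings, diaeresis and iota subscript. The names `count_accents`, `tally` and `sample` are hypothetical, and reading the actual MorphGNT files is left out because the file layout isn't described here.

```python
import unicodedata
from collections import Counter

# Combining marks counted as accents after NFD decomposition (an assumption):
# U+0301 acute, U+0300 grave, U+0342 Greek perispomeni (circumflex).
ACCENTS = {"\u0301", "\u0300", "\u0342"}


def count_accents(word):
    """Return the number of accent marks on a Unicode Greek word."""
    decomposed = unicodedata.normalize("NFD", word)
    return sum(1 for ch in decomposed if ch in ACCENTS)


def tally(pairs):
    """Count (part of speech, number of accents) over (pos, lemma) pairs."""
    counts = Counter()
    for pos, lemma in pairs:
        counts[(pos, count_accents(lemma))] += 1
    return counts


# Hypothetical sample data standing in for rows read from the database.
sample = [
    ("N", "λόγος"),
    ("RA", "ὁ"),
    ("V", "λέγω"),
    ("C", "καί"),
    ("N", "Ἀβραάμ"),
]

for (pos, n), total in sorted(tally(sample).items()):
    print(pos, n, total)
```

Decomposing to NFD first means precomposed characters (with either tonos or oxia) and breathing-plus-accent combinations are all handled the same way, since each accent becomes its own combining code point.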

Here are the counts by part of speech (rows) and number of accents on the lemma (columns):

          0      1      2
  A       -   9159      -
  C     924  17361      -
  D    1592   4606      -
  I       -     17      -
  N      30  28271      1
  P    5433   5488      -
  RA  19862      4      -
  RD      -   1744      -
  RI      -   1165      -
  RP      -  11584      -
  RR      -   1677      -
  V       8  28101      1
  X     147    844      -

Some of the low numbers are definitely errors in the database. Now to investigate...

UPDATE (2005-07-16): Both 2-accent cases were mistakes. The 30 0-accent nouns and 5 of the 0-accent verbs were foreign loan words that intentionally weren't accented, but the other 3 0-accent verbs were mistakes. The 4 accented articles were the result of crasis with the following noun, and the word should probably be analyzed as a noun rather than an article. I guess there'll be a 5.07 release soon. NOTE: I haven't looked at the particles, adverbs, conjunctions or prepositions yet.

The original post was in the category: morphgnt but I'm still in the process of migrating categories over.