My question is about NLTK's WordNet interface.

    >>> from nltk.corpus import wordnet as wn
    >>> wn.synsets('cat')
    [Synset('cat.n.01'), Synset('guy.n.01'), Synset('cat.n.03'),
     Synset('kat.n.01'), Synset('cat-o'-nine-tails.n.01'),
     Synset('caterpillar.n.02'), Synset('big_cat.n.01'),
     Synset('computerized_tomography.n.01'), Synset('cat.v.01'),
     Synset('vomit.v.01')]

I could not find what the n and the number that follows it mean in names such as cat.n.01 or caterpillar.n.02.

  • Please edit the question to explain what “wordnet” means for Python. If “wordnet” is a library, make that term a link to the home page so we can know what you're referring to.
    – bignose
    Commented Jan 16, 2016 at 19:30

1 Answer

Per the NLTK docs, a <lemma>.<pos>.<number> Synset string is composed of the following parts:

  • <lemma> is the word’s morphological stem
  • <pos> is one of the module attributes ADJ, ADJ_SAT, ADV, NOUN or VERB
  • <number> is the sense number (the docs say "counting from 0", but the names themselves are numbered from 01, as in cat.n.01)
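
To make the three parts concrete, here is a minimal sketch that splits a synset name string into its components. It uses only plain string handling, not NLTK, and the helper name parse_synset_name is made up for illustration.

```python
def parse_synset_name(name):
    """Split '<lemma>.<pos>.<number>' into (lemma, pos, number)."""
    # Split from the right so that a lemma containing dots stays in one piece.
    lemma, pos, number = name.rsplit(".", 2)
    return lemma, pos, int(number)


print(parse_synset_name("cat.n.01"))          # ('cat', 'n', 1)
print(parse_synset_name("caterpillar.n.02"))  # ('caterpillar', 'n', 2)
```

So in caterpillar.n.02 the lemma is caterpillar, the part of speech is n, and the sense number is 2.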

Thus, the <pos> is the part of speech. According to the wordnet man page, the part of speech character has the following meaning:

n    NOUN
v    VERB
a    ADJECTIVE
s    ADJECTIVE SATELLITE
r    ADVERB 
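
The table above can be written as a small lookup, sketched here in plain Python. The dictionary POS_NAMES and the helper pos_of are hypothetical names for illustration. As far as I know, the NLTK module attributes mentioned earlier (wn.NOUN, wn.VERB and so on) hold these same one-letter strings, but treat that as an assumption and check it against your installed version.

```python
# One-letter part-of-speech codes as they appear in synset names.
POS_NAMES = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def pos_of(synset_name):
    """Return the part-of-speech label for a '<lemma>.<pos>.<number>' name."""
    pos_code = synset_name.rsplit(".", 2)[1]
    return POS_NAMES[pos_code]


print(pos_of("vomit.v.01"))  # verb
print(pos_of("cat.n.01"))    # noun
```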

The <number> distinguishes between the different senses that share the same lemma and part of speech.
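
One way to see what each numbered sense means is to print its definition next to its name. The sketch below assumes Synset objects expose name() and definition() as methods, which is the case in NLTK 3 (in older releases they were plain attributes). The helper list_senses is a hypothetical name.

```python
def list_senses(synsets):
    """Return (name, definition) pairs for Synset-like objects."""
    # Assumes each object has name() and definition() methods (NLTK 3 style).
    return [(s.name(), s.definition()) for s in synsets]
```

With NLTK and the WordNet corpus installed, you could call it as list_senses(wn.synsets('cat', pos=wn.NOUN)) after from nltk.corpus import wordnet as wn, then compare the definitions of cat.n.01 and cat.n.03.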
