
Working with WordNet: synsets, synset, and distance measurement

by pub-lican-ai 2018. 12. 12.

Working with WordNet


WordNet

  • WordNet is needed when the same string can carry several different meanings
  • A string (str) maps to multiple Synsets, one per meaning

# Requires the WordNet corpus; run nltk.download('wordnet') once if it is missing
from nltk.corpus import wordnet

wordnet

# synsets(word) -> list of every Synset that matches the word

# synset(name) -> a single Synset looked up by name; useful methods: .definition() .examples() .lemmas() .hyponyms()

wordnet.synsets('car')

[Synset('car.n.01'),
 Synset('car.n.02'),
 Synset('car.n.03'),
 Synset('car.n.04'),
 Synset('cable_car.n.01')]
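The names in the list above follow the pattern word.pos.number: a lemma, a part-of-speech tag (n for noun, v for verb), and a sense number. A synset is named after one of its lemmas, which is why cable_car.n.01 can still come back for the query 'car'. The snippet below is only a minimal sketch of that naming convention; parse_synset_name is a hypothetical helper written for illustration and is not part of NLTK.

```python
def parse_synset_name(name):
    """Split a synset name such as 'car.n.01' into (lemma, pos, sense_number)."""
    # Split from the right so that only the last two dots are treated as separators.
    lemma, pos, number = name.rsplit(".", 2)
    return lemma, pos, int(number)


# Names taken from the outputs shown in this post.
for name in ["car.n.01", "cable_car.n.01", "corner.v.02"]:
    print(parse_synset_name(name))
```

In practice the string itself is passed straight to wordnet.synset(), so parsing like this is only useful for understanding or filtering names.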

wordnet.synset('car.n.01').definition()
wordnet.synset('car.n.01').examples()
wordnet.synset('car.n.01').lemmas()
[Lemma('car.n.01.car'),
 Lemma('car.n.01.auto'),
 Lemma('car.n.01.automobile'),
 Lemma('car.n.01.machine'),
 Lemma('car.n.01.motorcar')]


Synset hierarchy


motorcar = wordnet.synset('car.n.01')

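# Hyponyms: more specific synsets, one level down in the hierarchy

motorcar.hyponyms()[:5]

[Synset('ambulance.n.01'),
 Synset('beach_wagon.n.01'),
 Synset('bus.n.04'),
 Synset('cab.n.03'),
 Synset('compact.n.03')]

# Hypernyms: more general synsets, one level up in the hierarchy

motorcar.hypernyms()

[Synset('motor_vehicle.n.01')]

# Hypernym paths: every chain from the root down to this synset

motorcar.hypernym_paths()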

[[Synset('entity.n.01'),
  Synset('physical_entity.n.01'),
  Synset('object.n.01'),
  Synset('whole.n.02'),
  Synset('artifact.n.01'),
  Synset('instrumentality.n.03'),
  Synset('container.n.01'),
  Synset('wheeled_vehicle.n.01'),
  Synset('self-propelled_vehicle.n.01'),
  Synset('motor_vehicle.n.01'),
  Synset('car.n.01')],
 [Synset('entity.n.01'),
  Synset('physical_entity.n.01'),
  Synset('object.n.01'),
  Synset('whole.n.02'),
  Synset('artifact.n.01'),
  Synset('instrumentality.n.03'),
  Synset('conveyance.n.03'),
  Synset('vehicle.n.01'),
  Synset('wheeled_vehicle.n.01'),
  Synset('self-propelled_vehicle.n.01'),
  Synset('motor_vehicle.n.01'),
  Synset('car.n.01')]]
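The two paths above differ only in the middle, which shows that a synset can reach the root through more than one chain of hypernyms. As a rough sketch, the snippet below copies the synset names from that output into plain lists (no NLTK needed) and finds where the paths diverge; shared_prefix is a hypothetical helper name used only for this illustration.

```python
# Hypernym paths copied from the hypernym_paths() output above (root first).
path_a = [
    "entity.n.01", "physical_entity.n.01", "object.n.01", "whole.n.02",
    "artifact.n.01", "instrumentality.n.03", "container.n.01",
    "wheeled_vehicle.n.01", "self-propelled_vehicle.n.01",
    "motor_vehicle.n.01", "car.n.01",
]
path_b = [
    "entity.n.01", "physical_entity.n.01", "object.n.01", "whole.n.02",
    "artifact.n.01", "instrumentality.n.03", "conveyance.n.03",
    "vehicle.n.01", "wheeled_vehicle.n.01", "self-propelled_vehicle.n.01",
    "motor_vehicle.n.01", "car.n.01",
]


def shared_prefix(a, b):
    """Return the leading elements that two root-first paths have in common."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


prefix = shared_prefix(path_a, path_b)
# Depth here means the number of edges between the root and car.n.01 on each path.
depths = [len(path) - 1 for path in (path_a, path_b)]

print("paths diverge after:", prefix[-1])
print("depth along each path:", depths)
```

Both paths rejoin at wheeled_vehicle.n.01, so the difference is only whether a wheeled vehicle is reached through container.n.01 or through conveyance.n.03 and vehicle.n.01.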

wordnet.synsets('tree')

[Synset('tree.n.01'),
 Synset('tree.n.02'),
 Synset('tree.n.03'),
 Synset('corner.v.02'),
 Synset('tree.v.02'),
 Synset('tree.v.03'),
 Synset('tree.v.04')]

tree = wordnet.synset('tree.n.01')

# Holonyms: the whole that this synset is a member of

tree.member_holonyms()

[Synset('forest.n.01')]


Measuring distance through relations


dog = wordnet.synset('dog.n.01')
cat = wordnet.synset('cat.n.01')
wolf = wordnet.synset('wolf.n.01')

# Print the name and definition of every synset for a word
def print_word_definition(word):
    for synset in wordnet.synsets(word):
        print(synset.name() + ' : ' + synset.definition())

# Print the list of synsets for a word
def print_words_synsets(word):
    print(wordnet.synsets(word))

print_word_definition('wolf')

wolf.n.01 : any of various predatory carnivorous canine mammals of North America and Eurasia that usually hunt in packs
wolf.n.02 : Austrian composer (1860-1903)
wolf.n.03 : German classical scholar who claimed that the Iliad and Odyssey were composed by several authors (1759-1824)
wolf.n.04 : a man who is aggressive in making amorous advances to women
beast.n.02 : a cruelly rapacious person
wolf.v.01 : eat hastily

# Lowest common hypernym: the deepest synset that both synsets descend from
print(dog.lowest_common_hypernyms(cat))
print(dog.lowest_common_hypernyms(wolf))
human = wordnet.synset('homo.n.02')
print(dog.lowest_common_hypernyms(human))
# Path similarity: a score between 0 and 1, higher when the synsets are closer in the hierarchy
print(dog.path_similarity(wolf))
print(dog.path_similarity(cat))
print(dog.path_similarity(human))

[Synset('carnivore.n.01')]
[Synset('canine.n.02')]
[Synset('placental.n.01')]
0.3333333333333333
0.2
0.14285714285714285
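The scores above are consistent with path similarity being 1 / (1 + d), where d is the number of edges on the shortest path between two synsets through the hypernym hierarchy. The sketch below reproduces that arithmetic on a hand-written, simplified fragment of the hierarchy. The HYPERNYM table and the three functions are assumptions made for illustration: each node is given a single hypernym, whereas real WordNet synsets can have several, and none of this is NLTK's own implementation.

```python
# Hand-written, simplified fragment of the noun hierarchy (child -> hypernym).
# Assumption: one hypernym per node; real WordNet synsets may have several.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "placental",
    "human": "hominid",
    "hominid": "primate",
    "primate": "placental",
}


def path_to_root(node):
    """Return the chain from a node up to the top of this toy hierarchy."""
    path = [node]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def lowest_common_hypernym(a, b):
    """Return the first ancestor of b (walking upward) that is also above a."""
    ancestors_of_a = path_to_root(a)
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    return None


def path_similarity(a, b):
    """Compute 1 / (1 + number of edges between a and b via their common hypernym)."""
    common = lowest_common_hypernym(a, b)
    if common is None:
        return None
    distance = path_to_root(a).index(common) + path_to_root(b).index(common)
    return 1 / (1 + distance)


for other in ["wolf", "cat", "human"]:
    print(other, lowest_common_hypernym("dog", other), path_similarity("dog", other))
```

In this toy graph dog and wolf are 2 edges apart, dog and cat 4, and dog and human 6, giving 1/3, 1/5, and 1/7, the same values printed by the NLTK calls above.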

Korean WordNet

KorLex (http://korlex.pusan.ac.kr/) is available if you want a Korean WordNet, but using it is optional.

