Import the required libraries:
import gensim
from gensim.models import word2vec
from gensim.models import KeyedVectors
from sklearn.metrics.pairwise import cosine_similarity
Upload the emoji2vec dataset (Word2Vec binary format) into the /content folder:
Alternatively, download it from Archive.org:
!wget 'https://archive.org/download/word-embeddings/emoji2vec.bin' -P '/content'
Load the emoji2vec model:
e2v = KeyedVectors.load_word2vec_format('/content/emoji2vec.bin', binary=True)
happy_vector = e2v['😂']  # produces an embedding vector of length 300
print(happy_vector.shape)
>>>>(300,)
Find the cosine similarity between two emoji vectors:
v_king = e2v["🤴"]
v_queen = e2v["👸"]
print(v_king.shape)
print(v_queen.shape)
cosine_similarity([v_king], [v_queen])
>>>>(300,)
(300,)
array([[0.48802766]], dtype=float32)
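The value returned by `cosine_similarity` is just the dot product of the two embeddings divided by the product of their norms. A self-contained sketch of that formula, using toy vectors in place of the 300-dimensional emoji embeddings:

```python
import numpy as np

def cosine(a, b):
    # cosine similarity: dot(a, b) / (|a| * |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v_a = np.array([1.0, 2.0, 3.0])
v_b = np.array([2.0, 4.0, 6.0])  # parallel to v_a, so similarity is 1.0

print(cosine(v_a, v_b))  # 1.0
```

Values near 1 mean the emoji are used in similar contexts; the 0.488 above indicates a moderate similarity between 🤴 and 👸.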
Reference:
https://github.com/uclnlp/emoji2vec