Generating a Word2Vec model from a block of Text using Gensim (Python)

Short tutorial on NLP & ML

Sachin Joglekar's blog

screen-shot-2015-04-10-at-4-16-00-pm

Word2Vec is a semantic learning framework that uses a shallow neural network to learn the representations of words/phrases in a particular text. Simply put, its an algorithm that takes in all the terms (with repetitions) in a particular document, divided into sentences, and outputs a vectorial form of each. The ‘advantage’ word2vec offers is in its utilization of a neural model in understanding the semantic meaning behind those terms. For example, a document may employ the words ‘dog’ and ‘canine’ to mean the same thing, but never use them together in a sentence. Ideally, Word2Vec would be able to learn the context and place them together in its semantic space. Most applications of Word2Vec using cosine similarity to quantify closeness. This Quora question (or rather its answers) does a good job of explaining the intuition behind it.

You would need to take the following steps to develop a Word2Vec model…

View original post 730 more words

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s