Richard works as a search and data scientist. He enjoys sifting through the intricacies of algorithms and models, analysing data, refactoring code until it's understandable and elegant, reading up on new domains and problems, and designing practical solutions.

A look at the source code of gensim doc2vec

Previously, we built a simple PV-DBOW-‘like’ model (https://amsterdam.luminis.eu/2017/02/21/coding-doc2vec/). We made a couple of choices, e.g., about how to generate training batches and how to compute the loss function. In this blog post, we’ll take a look at the choices made...

Coding doc2vec

In two previous posts, we googled doc2vec [1] and “implemented” [2] a simple version of a doc2vec algorithm. You could say we gave the specifications for a doc2vec algorithm. But we did not actually write any code. In this post,...

Implementing doc2vec

In a previous post [1], we took a look at what doc2vec is, who invented it, what it does, what implementations exist, and what has been written about it. In this post, we will work out the details of a...

Googling doc2vec

Recently, this site featured a blog post [12] that used Doc2vec [4]. What is Doc2vec? Where does it come from? What does it do? Why use Doc2vec instead of other algorithms that do the same? What implementations exist? Where...

