QuickGraph#18 Semantic similarity metrics in taxonomies: A Wikipedia example on uncrewed spacecraft

In this post I’ll give you an overview of some similarity metrics I came across while working with WordNet. Even though they were originally proposed as linguistic similarity metrics, I thought it would make sense to explore how they behave when we generalise their use to a taxonomy-annotated dataset.

I will use public data from Wikipedia. And what topic to choose in the week that Percy landed on Mars? None other than the rich domain of uncrewed spacecraft. Follow me!


QuickGraph#17 The English WordNet in Neo4j (part 2)

In this second post on WordNet in Neo4j I will focus on querying and analysing the graph that we created in the previous post. I’ll leave some more advanced analysis, and maybe integrations with NLTK or RDF, for a third instalment.

Remember that you can test all the examples in this post directly on the demo server. The access credentials are wordnet/wordnet (you’ll also need to select the database of the same name). I’ve also put the queries in a Colab Python notebook if you prefer to run them from there.

Let’s crack on.


QuickGraph#16 The English WordNet in Neo4j (part 1)

English WordNet is a representation of the English language as a lexical network. It groups words into synsets and links them according to semantic relationships such as hypernymy, antonymy and meronymy. You can actually browse through its content on the English WordNet website. WordNet is often used in natural language processing (NLP) applications (and in many others too) and provides deep lexical information about the English language as a graph. As a graph… that sounds interesting, definitely worth a QuickGraph.
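
To make the idea of synsets and typed relationships a bit more concrete, here is a minimal Python sketch. It is not the model built in the post: the synset ids, the words, the relationship names (`HYPERNYM`, `ANTONYM`, `MERONYM_OF`) and the `related` helper are all made up for illustration, and every relationship is attached to synsets for simplicity.

```python
from collections import defaultdict

# Toy data invented for illustration; not taken from English WordNet.
# Each synset groups words that share a meaning.
synsets = {
    "s_dog": {"dog", "domestic dog"},
    "s_canine": {"canine", "canid"},
    "s_hot": {"hot"},
    "s_cold": {"cold"},
    "s_car": {"car", "automobile"},
    "s_wheel": {"wheel"},
}

# Typed, directed edges between synsets: (source, relationship, target).
# The relationship names are hypothetical labels chosen for this sketch.
edges = [
    ("s_dog", "HYPERNYM", "s_canine"),   # a dog is a kind of canine
    ("s_hot", "ANTONYM", "s_cold"),      # hot is the opposite of cold
    ("s_wheel", "MERONYM_OF", "s_car"),  # a wheel is a part of a car
]

# Adjacency list: synset id -> list of (relationship, target synset id).
graph = defaultdict(list)
for source, rel, target in edges:
    graph[source].append((rel, target))


def related(word, rel):
    """Return the words of every synset one `rel` edge away from `word`."""
    found = set()
    for synset_id, words in synsets.items():
        if word in words:
            for edge_rel, target in graph[synset_id]:
                if edge_rel == rel:
                    found |= synsets[target]
    return found


print(sorted(related("dog", "HYPERNYM")))
print(sorted(related("wheel", "MERONYM_OF")))
```

The point of the sketch is only that words hang off synsets and the relationships connect synsets, which is what makes a graph a natural fit.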

Because this is a particularly rich case I’ll break it down into at least two instalments. In the first one I’ll explain the construction of the graph in Neo4j and in the second one I’ll show some interesting ways of using it. I hope you’ll enjoy it.
