As in previous posts, for those of you less familiar with the differences and similarities between RDF and the Property Graph, I recommend you watch this talk I gave at Graph Connect San Francisco in October 2016.
In the previous post in this series, I showed the most basic way in which a portion of your graph can be exposed as RDF: identifying a node by its ID, or by its URI if your data was imported from an RDF dataset. In this one, I’ll explore a more interesting way: running Cypher queries and serialising the resulting subgraph as RDF. Continue reading “Neo4j is your RDF store (part 2)”
Retail banking: Your graph-based fraud detection system powered by Neo4j is being used as part of the controls run when processing line of credit applications or when accounts are provisioned. Its job is to block, or at least to flag, potentially fraudulent submissions as they come into your systems. It’s also sending alarms to fraud operations analysts whenever unusual patterns are detected in the graph so they can be individually investigated ASAP.
This is all working great, but you want other analysts in your organisation to benefit from the rich insights that your graph database can deliver: people whose job is not to react on the spot to individual fraud threats but rather to understand the bigger picture. They are probably more strategic business analysts, maybe some data scientists doing predictive analysis too. They will typically want to look at fraud patterns globally rather than individually, and to combine the information in your fraud detection graph with other data sources (external to the graph) for reporting purposes, to get new insights, or even to ‘learn’ new patterns by running algorithms or applying ML techniques.
In this post I’ll describe, through an example, how Data Virtualization can be used to integrate your Neo4j graph with other data sources, providing a single unified view that is easy to consume from standard analytical/BI tools. Continue reading “Graph DB + Data Virtualization = Live dashboard for fraud analysis”
If you want to understand the differences and similarities between RDF and the Labeled Property Graph implemented by Neo4j, I’d recommend you watch this talk I gave at Graph Connect San Francisco in October 2016.
Let me start with some basics: RDF is a standard for data exchange, but it does not impose any particular way of storing data.
What do I mean by that? I mean that data can be persisted in many ways: tables, documents, key-value pairs, property graphs, triple graphs… and still be published/exchanged as RDF. Continue reading “Neo4j is your RDF store (part 1)”
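To make the point above concrete, here is a minimal sketch of a record that could live in a table or a document store being published as RDF in N-Triples syntax. The `http://example.org/` namespace, the sample record and the `row_to_ntriples` helper are all assumptions made for illustration; this is not how Neo4j or any particular library does it.

```python
# Hypothetical records, as they might come out of a table or a document store.
rows = [
    {"id": "miles", "name": "Miles Davis", "born": 1926},
]

BASE = "http://example.org/"  # assumed namespace, for illustration only
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(row, base=BASE):
    """Turn one record into a list of N-Triples lines, one per non-id field."""
    subject = f"<{base}resource/{row['id']}>"
    lines = []
    for key, value in row.items():
        if key == "id":
            continue  # the id is used to build the subject URI, not a triple
        predicate = f"<{base}vocab/{key}>"
        if isinstance(value, int):
            obj = f'"{value}"^^<{XSD_INTEGER}>'
        else:
            obj = f'"{escape_literal(str(value))}"'
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


for row in rows:
    print("\n".join(row_to_ntriples(row)))
```

The storage side (here a plain Python dict) is irrelevant to the consumer, who only sees triples.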
For this example I am going to use my browser history data. Most browsers store this data in SQLite. This means relational data, easy to access from Neo4j using the apoc.load.jdbc stored procedure. Continue reading “QuickGraph#4 Explore your browser history in Neo4j”
As far as I know, the only way to query Google’s Knowledge Graph currently is the search API. Let’s run a query on it and search, for instance, for Miles Davis’ album “Sketches of Spain”. Continue reading “The ‘hidden’ connections in Google’s Knowledge Graph”
For this example, I am going to use a sample movie dataset from the Cayley project. It’s a set of half a million triples about actors, directors and movies that can be downloaded here. Continue reading “QuickGraph#3 A step-by-step example of RDF to Property Graph transformation”
For this QuickGraph I’ll use data about Wikipedia Categories. You may have noticed at the bottom of every Wikipedia article a section listing the categories it’s classified under. Every Wikipedia article will have at least one category, and categories branch into subcategories forming overlapping trees. A category can be a subcategory of more than one parent category (the Wikipedia hierarchy is an example of this), so the hierarchy is effectively a graph. Continue reading “QuickGraph#2 How is Wikipedia’s knowledge organised”
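As a rough illustration of why a hierarchy with multiple parents is a graph rather than a tree, the sketch below walks up from a category to collect all of its ancestors. The category names and the `parents` mapping are made up for the example and are not taken from the Wikipedia dataset.

```python
# Hypothetical subcategory -> parent categories mapping (illustrative names only).
parents = {
    "Jazz trumpeters": ["Jazz musicians", "Trumpeters"],  # two parents
    "Jazz musicians": ["Musicians"],
    "Trumpeters": ["Musicians"],
    "Musicians": [],
}


def ancestors(category, parents):
    """Return every category reachable by following parent links upwards."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already reached through another path
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("Jazz trumpeters", parents)))
```

The `seen` set matters precisely because the same ancestor can be reached through more than one path, which cannot happen in a strict tree.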
The first of a series of quick graphs in Neo4j built from public data. Watch this space! I’ll analyse a dataset on European politics by building a graph and querying across a number of dimensions. Continue reading “QuickGraph #1 European Politics from DBpedia. Loading data from an RDF triple store into Neo4j via SPARQL”
The previous blog post might have been a bit too dense to start with, so I’ll try something a bit lighter this time: importing RDF data into Neo4j. It assumes, however, a certain degree of familiarity with both RDF and graph databases. Continue reading “Importing RDF data into Neo4j”
There are two key characteristics of RDF stores (aka triple stores): the first and by far the most relevant is that they represent, store and query data as a graph. The second is that they are semantic, which is a rather pompous way of saying that they can store not only data but also explicit descriptions of the meaning of that data. The RDF and linked data community often refers to these explicit descriptions as ontologies. In case you’re not familiar with the concept, an ontology is a machine-readable description of a domain that typically includes a vocabulary of terms and some specification of how these terms inter-relate, imposing a structure on the data for that domain. This is also known as a schema. In this post, the terms schema and ontology will be used interchangeably to refer to these explicitly described semantics. Continue reading “Building a semantic graph in Neo4j”