Data tools like Elasticsearch or MongoDB have pretty elegant ways to do simple location, or geospatial, searches, e.g. show me a list of “X” things within “Y” distance of me.

However, if you use those tools, it’s going to be much harder to connect points to varied kinds of related data. Simple searches using Elasticsearch, for example, are tied to a single object type. Graphs make it much easier to connect MULTIPLE types of objects to a geospatial point, and to update all of those objects as well.

If you’re using Neo4j and the spatial plugin (or you want to), then I’ve got a simple way to add US-based zip codes as nodes for you to start using TODAY. Seriously, this will save you time if you're doing any type of work around US-based searches on a map.


Consider the following data questions:

  • Who do I know in the network that’s within 1 mile of me?
  • Which stores have X within 5 miles of me?
  • Which customers bought X and also live within 100 meters of Y?

If your “things” have a point (or points) on the earth, then you can answer those questions. But how?

You can do it with MongoDB or Elasticsearch.

However, you'll need to keep up with the geometry for each object or doc type that uses a point. Also, if you want to relate any of those objects or doc types to another one (or have multiple relationships), then you will have to maintain references between them yourself or stitch them together in your app code.

In a graph database like Neo4j, you just create a relationship from a node or nodes to the node that has the point data. Now, you can haz a single node type that keeps up with how EVERYTHING is connected spatially!

So, if you need to connect multiple things to, for example, a zip code, then choosing a graph database like Neo4j is an easy decision.

How does it work in Neo4j? I am glad you asked!

What I used:

  • Neo4j 3.1. (DISCLOSURE: WE PROVIDE A SUBSCRIPTION SERVICE FOR ENTERPRISE NEO4J)

  • The spatial plugin (you just drop the jar in your "plugins" folder and DISCLOSURE: WE INCLUDE THE SPATIAL PLUGIN WITH THE SUBSCRIPTION SERVICE FOR ENTERPRISE NEO4J REFERENCED IN THE DISCLOSURE ABOVE)

While, yes, we do provide the subscription service for Enterprise Neo4j, I recommend you dev with a local Neo4j instance to get started. Our service is great, but you don't need it now. Or do you? Your call.


3 Steps to Location Goodness with a Graph

So, once you've got an instance of Neo4j and the spatial plugin added, follow these extremely tedious six lines of code.

1. Set up a point layer

Two ways, but let's just use the easiest. Run the following Cypher command and procedure.

CALL spatial.addPointLayer('geom')
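If you want to confirm the layer exists before moving on, here is a minimal sketch. It assumes your version of the spatial plugin exposes the spatial.layers() procedure (the plugin's README uses it, but I have not checked every release), so treat it as optional.

```cypher
// Lists the spatial layers the plugin knows about.
// Assumption: spatial.layers() is available in your plugin version.
// You should see a row whose name is 'geom'.
CALL spatial.layers()
```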


2. Add some nodes that have "latitude" and "longitude" properties stored as floats

Don't have those things you say? Got you covered. Just use this:

LOAD CSV WITH HEADERS FROM "https://s3.amazonaws.com/gs-demo-applications/location-zip/us_zips.csv" AS row
CREATE (:Location { locationId: row.locationId, city: row.city, county: row.county, usState: row.usState, zip: row.zip, latitude: toFloat(row.lat), longitude: toFloat(row.lon)})


This will add 41,160 Location nodes to your graph (thanks simplemaps.com for the zip code data!). Please go thank the folks at simplemaps.com, too. You can get a 3 year update on zip codes for less than the price of a mediocre Android phone. Or at least attribute them in your app or website. FOR GOSH SAKES, IS THAT TOO MUCH TO ASK?
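A quick sanity check after the load can save some head scratching later. The two queries below are a sketch that only uses the label and property names from the LOAD CSV statement above. Run them one at a time.

```cypher
// How many Location nodes exist? After one run of the load, expect 41,160.
MATCH (l:Location)
RETURN count(l) AS locations
```

```cypher
// toFloat() returns null for text it cannot parse, and null properties are
// not stored, so this counts nodes whose coordinates did not load.
MATCH (l:Location)
WHERE l.latitude IS NULL OR l.longitude IS NULL
RETURN count(l) AS missingCoordinates
```

If the first number is a multiple of 41,160, the load probably ran more than once. If the second number is not zero, those nodes have no usable point and are worth fixing before you index them in step 3.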

3. Index the locations

Here is the hard part! I am kidding - another four liner (six with comments). Seriously, this takes a few seconds to run.

// FOR neo4j-spatial-0.24.1-neo4j-3.1.1-server-plugin.jar
// also notice the Tone Loc reference
MATCH (l:Location) WITH collect(l) as locedafterdark
CALL spatial.addNodes('geom', locedafterdark) 
YIELD count
RETURN count



If you are using neo4j-spatial-0.23-neo4j-3.0.4-server-plugin.jar, use this instead. It'll take about 15 seconds to run. The difference is that this version of spatial.addNodes() returns the nodes instead of a count.

// FOR neo4j-spatial-0.23-neo4j-3.0.4-server-plugin.jar
// also notice the Tone Loc reference
MATCH (l:Location) WITH collect(l) as locedafterdark
CALL spatial.addNodes('geom', locedafterdark) 
YIELD node
RETURN count(node) as count



How do I know if I am using neo4j-spatial-0.23-neo4j-3.0.4-server-plugin.jar?

Run this:

CALL dbms.procedures() YIELD name, signature 
WHERE name STARTS WITH "spatial.addNodes"
RETURN name, signature



If the signature in your result shows the procedure returning a node rather than a count, then you're likely using neo4j-spatial-0.23.




On step 3, there might be faster, more performant ways to add larger sets of geo data. However, they require APOC, and I didn't want to give you another jar to go and get and yet another thing with which to futz. I just like you too much. What can I say? I'm a giver.


Now, let's test it. Run some coordinates through withinDistance like so (this is a Memphis, TN coordinate, and the 5.0 is the search radius in kilometers):

CALL spatial.withinDistance("geom",{latitude:35.1531,longitude:-90.0555},5.0) 
YIELD node as location
RETURN location


This should result in five nodes being returned. If so, that's it! You're all set for this part.
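If the graph view makes it hard to tell which nodes came back, a tabular variation of the same call may be easier to read. This is only a sketch. It reuses the same procedure and returns the zip, city and usState properties set by the load in step 2.

```cypher
// Same search as above, but return plain columns instead of whole nodes.
CALL spatial.withinDistance("geom",{latitude:35.1531,longitude:-90.0555},5.0)
YIELD node AS location
RETURN location.zip AS zip, location.city AS city, location.usState AS state
ORDER BY zip
```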

Next, let's connect another node, like a Product node...

CREATE (n:Product {id:"1",t:"The product"}) RETURN n


Connect it to a location node...

MATCH (l:Location {zip:"38103"})
MATCH (p:Product {id:"1"})
CREATE (l)-[:HAS]->(p)
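The lookup above finds the Location by its zip property. With 41,160 of them, an index on that property will likely help once you are connecting lots of things. A minimal sketch using the Neo4j 3.x index syntax (newer Neo4j versions spell this differently, so check the docs for yours):

```cypher
// Schema index so MATCH (l:Location {zip: ...}) does not scan every Location.
// Run this as its own statement, separate from any data changes.
CREATE INDEX ON :Location(zip)
```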


Then, search "withinDistance" of a location and match those locations to a relationship:

CALL spatial.withinDistance("geom",{latitude:35.15,longitude:-90.05},5.0) 
YIELD node as l 
MATCH (l)-[:HAS]->(p:Product {id:"1"})
RETURN p


This will result in the product node being returned. Alternatively, you could have a store or stores connected to the location, and then products in stock at the store, e.g.

MATCH (l)-[:CONTAINS]->(s:Store)-[:HAS]->(p:Product {id:"1"}) 

A.K.A. "Stores within 5 km of X zip code that have Y product"
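To make that pattern concrete, here is a rough sketch of the store version end to end. The Store node, its id and its name are made up for illustration. The CONTAINS and HAS relationship types come from the pattern above. Run the two statements one at a time.

```cypher
// Statement 1: hang a hypothetical store off the 38103 location and stock it
// with the product created earlier. MERGE keeps reruns from making duplicates.
MATCH (l:Location {zip:"38103"})
MATCH (p:Product {id:"1"})
MERGE (s:Store {id:"store-1"})
  ON CREATE SET s.name = "Example store"
MERGE (l)-[:CONTAINS]->(s)
MERGE (s)-[:HAS]->(p)
```

```cypher
// Statement 2: stores attached to locations within 5 km of the coordinate
// that have product "1".
CALL spatial.withinDistance("geom",{latitude:35.15,longitude:-90.05},5.0)
YIELD node AS l
MATCH (l)-[:CONTAINS]->(s:Store)-[:HAS]->(p:Product {id:"1"})
RETURN DISTINCT s, p
```

The search radius applies to the zip code's point, not the store's own address. If you need store-level precision, give each Store its own latitude and longitude and add it to the layer the same way as the Location nodes.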


By following this, I just saved you countless hours of doing all this yourself. Do not tell anyone of your unparalleled efficiency and take the rest of the day to cruise...let's go with...thinkgeek.com