Neo4j: performance issue finding all paths between two nodes with Cypher


I'm using a downloaded copy of the themoviedb database. It has ~60k nodes and ~100k relationships, and I need to find all paths of a given length k between two nodes a and b, identified by their name property. Let's say I need to find the paths of length 2 between Keanu Reeves and Laurence Fishburne. I used the following Cypher query:

MATCH (k)-[e*2..2]-(l) WHERE k.name = "Keanu Reeves" AND l.name = "Laurence Fishburne" RETURN k,e,l

and it took 40 seconds.

I decided to try a different approach and used the following query instead:

MATCH (k)--(m)--(l) WHERE k.name = "Keanu Reeves" AND l.name = "Laurence Fishburne" RETURN k,m,l

and it took 252 milliseconds!

Those two queries gave the same results and have the same meaning, yet the first one took roughly 160 times longer (40 s versus 252 ms). How is that possible?

I need to run some tests in which I have to find all paths up to a given maximum (but with no minimum) length between two given nodes. This causes me problems, because I cannot use the second approach described above (it only works for a fixed path length) and the first one is way too slow.

I cannot use allShortestPaths because it doesn't return paths that are longer than the shortest one.

It's driving me crazy... any idea how to solve it?

Edit

Another example of how big the issue is: finding the paths of length 4 between Robert Downey Jr. and Harrison Ford. Method #2: ~500 milliseconds. Method #1: >360 seconds (after 6 minutes I brutally unplugged the PC's power adaptor).

The reason the first query is taking so long is that it is not using the indexes at all; it is scanning the entire database.

If you change your query to include the Actor label in the path matching, you will improve the query performance.

MATCH (k:Actor)-[e*2..2]-(l:Actor) WHERE k.name = "Keanu Reeves" AND l.name = "Laurence Fishburne" RETURN k,e,l

If you list the indexes by executing the :schema command in the browser, you will see the indexes that are in place. You can see that the first one is on :Actor(name); that is, within the Actor label the name property is indexed.

Indexes
  ON :Actor(name)    ONLINE
  ON :Director(name) ONLINE
  ON :Movie(title)   ONLINE
  ON :Person(name)   ONLINE
  ON :User(login)    ONLINE (for uniqueness constraint)
Constraints
  ON (user:User) ASSERT user.login IS UNIQUE
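Since the question is really about paths up to a maximum length, here is a rough sketch of how the labelled pattern might be written with only an upper bound on the path length. The Python is there purely to build the query string. The function name, the `$start_name` / `$end_name` parameters (the `$` syntax assumes a reasonably recent Neo4j version) and the `hops` alias are my own inventions, not something from the dataset or the question.

```python
def bounded_path_query(max_len: int, label: str = "Actor") -> str:
    """Build a Cypher query for all paths of at most max_len hops.

    Both end nodes carry a label so the planner has a chance to use the
    index on that label's name property. The label is interpolated
    directly into the string, so it must come from trusted code.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return (
        f"MATCH p = (k:{label} {{name: $start_name}})"
        f"-[*..{max_len}]-"          # no lower bound given, so it defaults to 1
        f"(l:{label} {{name: $end_name}}) "
        "RETURN p, length(p) AS hops"
    )


if __name__ == "__main__":
    print(bounded_path_query(4))
```

You would then pass "Keanu Reeves" and "Laurence Fishburne" as the two parameters when running the query, and PROFILE it to check that the index is actually picked up.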

If you profile the original query

PROFILE MATCH (k)-[e*2..2]-(l) WHERE k.name = "Keanu Reeves" AND l.name = "Laurence Fishburne" RETURN k,e,l

and then profile the one with the :Actor label added, it becomes abundantly clear why the two perform differently.

PROFILE MATCH (k:Actor)-[e*2..2]-(l:Actor) WHERE k.name = "Keanu Reeves" AND l.name = "Laurence Fishburne" RETURN k,e,l

I forgot to add that you should also profile the second (faster) query:

PROFILE MATCH (k)--(m)--(l) WHERE k.name = "Keanu Reeves" AND l.name = "Laurence Fishburne" RETURN k,m,l

You will see that the query plans are different. I think that adding the asterisk to the relationship sends the database engine down a different optimization path.

Good luck!
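If the fixed-length form keeps getting the better plan, one possible workaround for the "maximum length" case is to generate one fixed-length query per length, from 1 up to the maximum, and run them one after another. The sketch below only builds the query strings. I have not measured this approach, and the function names, the `m1`, `m2`, ... variables and the `$start_name` / `$end_name` parameters are made up for illustration.

```python
def fixed_length_query(hops: int, label: str = "Actor") -> str:
    """Build a Cypher query for paths of exactly `hops` relationships,
    spelled out as (k)--(m1)--...--(l) with no asterisk."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    # A path with `hops` relationships has hops - 1 intermediate nodes.
    middle = "".join(f"--(m{i})" for i in range(1, hops))
    return (
        f"MATCH p = (k:{label} {{name: $start_name}})"
        f"{middle}--"
        f"(l:{label} {{name: $end_name}}) "
        "RETURN p"
    )


def queries_up_to(max_hops: int, label: str = "Actor") -> list:
    """One fixed-length query for each length from 1 to max_hops."""
    return [fixed_length_query(h, label) for h in range(1, max_hops + 1)]


if __name__ == "__main__":
    for query in queries_up_to(3):
        print(query)
```

Collecting the results of all the generated queries should give the same set of paths as a single bounded variable-length pattern. Whether it is actually faster on your data is something only PROFILE can tell you.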

