igraph-help
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] all_pairs shortest path


From: Tamás Nepusz
Subject: Re: [igraph] all_pairs shortest path
Date: Thu, 20 Dec 2012 17:07:45 +0100

> G = igraph.Graph(directed = True)
> 
> 1) Adding edges:
> G.add_edge([(1,2)])
> G.add_edge([(1,2)])
> if I add twice the same edges, my digraph keeps two occurence of the same 
> edges in G.get_edgelist: [(1, 2), (1, 2)]
Why is that a problem? If you add the edge twice, of course you are going to 
see it multiple times in the graph.

> How can i remove duplicates?
Use the simplify() method of the graph, this will remove multiple and/or loop 
edges.

> 2) Is there a way to compute once for all all shortest paths between all 
> pairs?
There is -- calculate them for each source vertex and then concatenate the 
resulting lists. :)
Okay, so the full story is as follows:

- If you need the _lengths_ of the shortest paths for each pair of vertices, 
you can use G.shortest_paths(), which returns a matrix. In your case, it's 
gonna be a 10K by 10K matrix, which is pretty big, but theoretically it's 
manageable (unless you want to print the resulting matrix from IPython because 
IPython will spend a lot of time with formatting the matrix). To be honest, I 
have never found a use-case where the full matrix was useful for me -- I 
usually calculate the shortest path from one vertex to all the others, and then 
move on to the next vertex, one by one.

- If you need _one_ shortest path between all pairs of vertices, use 
G.get_shortest_paths() and concatenate the resulting lists. Again, this is 
going to be a huge list if you have 10K vertices:

all_paths = sum((g.get_shortest_paths(i) for i in xrange(g.vcount())), [])

all_paths will then give you exactly one possible path for each pair, so you 
can simply find the path leading from vertex i to vertex j in entry (i*N+j) of 
the list, where N is the number of vertices.

- If you need _all_ shortest paths between all pairs of vertices, use 
G.get_all_shortest_paths(), similarly to the previous step. The only catch is 
that it is harder to find all the paths between vertex i and j in this huge 
list because one particular pair might occupy multiple positions in the list.

> 3) why is G.get_all_shortest_paths(1) does not work on the graph joined 
> (graph.edges)?
My guess is that it works but it takes a looooooooong time because there are 
too many shortest paths and get_all_shortest_paths() aims to return all of 
them. For instance, there are 1116 shortest paths between vertex 1 and vertex 
46 in your graph. This is not a problem on its own, but consider another vertex 
X for which vertex 46 lies on the shortest path from vertex 1 to X -- this 
means that vertex X will also be reachable from at least 1116 shortest paths 
(since you go first from vertex 1 to vertex 46 on any of the 1116 paths, then 
from vertex 46 to X). This can easily result in combinatorial explosion.

> Other remark maybe bug: 
> http://stackoverflow.com/questions/13974279/igraph-why-is-add-edge-function-so-slow-ompared-to-add-edges
This is not a bug; it works as intended. The reason is that igraph uses an 
indexed edge list as its data structure in the C layer. The index makes it 
possible to query the neighbors of a specific vertex in constant time. This is 
good if your graph rarely changes, but it becomes a burden when the 
modification operations are far more frequent than the queries, since whenever 
you add or remove an edge, you have to update the index. So, every call 
toadd_edge will make igraph reindex its internal data structures. The upside is 
that if you have to rebuild the index anyway, you can just as well add many 
edges using add_edges with approximately the same cost. So, in your first code 
example, you rebuild the index 30000 times, while in the second example, you 
rebuild the index only once.

Best,
Tamas


reply via email to

[Prev in Thread] Current Thread [Next in Thread]