igraph-help
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] Python - Creating new vertices and edges


From: Yazan Boshmaf
Subject: Re: [igraph] Python - Creating new vertices and edges
Date: Thu, 27 Feb 2014 14:10:24 -0800

Thanks, Tamás, for your answer. It's
much
clearer now
.


On Thu, Feb 27, 2014 at 1:51 AM, Tamás Nepusz <address@hidden> wrote:
Hi,

> I was wondering what's the rationale behind not returning a reference object to newly created vertices and edges in Python (or at least an index)?
References to newly created vertices and edges are not returned because there is actually no new vertex/edge object created when you do the addition. The core of igraph is written in C, so adding a vertex involves increasing a variable that stores the number of vertices, plus extending a few index vectors; similarly, adding an edge involves extending some vectors and recreating some indices. The Vertex and Edge objects you see in Python are actually dead simple proxies that store a reference to the graph they are a part of, plus the index of the vertex or edge that they represent. Creating such a proxy object just for the sake of returning it is an overhead, especially considering that they will be thrown away immediately.

You are right that we could at least return the index of the newly created edge or vertex; we do not do this because igraph simply allocates the next available vertex or edge ID. So, if you have n vertices and add one more, the newly created vertex will have ID=n (and the same for edges).

> If I am reading a large graph file (special-format, let's assume), checking if a vertex (or an edge between two vertices) has been already created is really problematic.
How about using g.are_connected(source, target)? This method automatically accepts vertex names and converts them to vertex IDs.

> I am assuming that "find" takes O(n) time where n is number of nodes. So this is an O(n*n) time operation.

True, but most methods accept vertex names in place of vertex IDs, and they use an internal dictionary that maps vertex names back to vertex IDs to do the conversion. So, for instance, using g.vs.find(name=“whatever”).degree() is O(n), but using g.degree(“whatever”) is O(1). This is because the find() method cannot assume that the VertexSeq it is called on represents the entire vertex set (it might represent a subset or a permutation of it) so we cannot use a plain dictionary lookup.

For what it’s worth, my goal is to allow the usage of vertex names in place of vertex IDs for every igraph method that accepts vertices or lists of vertices. Some of the methods have slipped through the cracks, so for instance g.get_eid(“A”, “B”) does not work yet (even though g.are_connected(“A”, “B”) does). If you find a method that should accept vertex names but does not accept it yet, please file a bug report at https://github.com/igraph/igraph and I’ll fix it.

By the way, if you are constructing a large graph with repeated calls to add_vertex() and add_edge(), then you will probably encounter another bottleneck as your graph grows. igraph’s data structures are optimized for static graphs (i.e. graphs where vertex and edge mutations are far less frequent than queries). Every time you add an edge, some internal data structures have to be rebuilt from scratch, which effectively means that it is almost as fast to add, say, thousands of edges at the same time as the addition of a single edge. In most cases, it is way faster to create the edge list of your graph in advance and then add all the edges at once.

Alternatively, you may take a look at the Graph.DictList() constructor, which takes two iterables, one for the vertices and one for the edges. Both iterables should yield dictionaries; keys of the dictionaries are converted to vertex/edge attribute names and the corresponding values are converted to vertex/edge attribute values. The “source” and “target” keys in the dictionaries corresponding to edges are automatically used to look up the source and target vertex efficiently, and Graph.DictList() is also capable of “batching” edge additions in larger chunks to alleviate the overhead of repeated calls to add_edges().

All the best,
T.


_______________________________________________
igraph-help mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/igraph-help


reply via email to

[Prev in Thread] Current Thread [Next in Thread]