igraph-help


Re: [igraph] Performance issue regarding when calculating induced_subgra


From: AaaSDFfff
Subject: Re: [igraph] Performance issue regarding when calculating induced_subgra
Date: Mon, 2 May 2016 17:31:19 +0200 (CEST)

Hi Tamás,
 
First of all, thank you for your reply, and thank you again for the personal consultation. We tried your solution, and after removing the name attribute from our graph the calculations appear to finish within a reasonable time. However, when we ran your exact code we got quite different results:
 
> n = 15000
> radius = 0.2 / ((n/100) ** 0.5)
> g = grg.game(n, radius)
> cl = label.propagation.community(g)
> system.time(lapply(groups(cl), function(x){induced.subgraph(g, x)}))
   user  system elapsed
   1.14    0.00    1.14
> V(g)$name = 10000000:(10000000+n-1)
> cl = label.propagation.community(g)
> system.time(lapply(groups(cl), function(x){induced.subgraph(g, x)}))
   user  system elapsed
   8.86    0.08    8.93
> V(g)$name = sapply(10000000:(10000000+n-1), toString)
> cl = label.propagation.community(g)
> system.time(lapply(groups(cl), function(x){induced.subgraph(g, x)}))
   user  system elapsed
   1.46    0.04    1.51
 
We got the biggest slowdown (8.93 s elapsed, versus 1.14 s without names) when the name attribute was numeric. With string names, as in your third run, the slowdown was small (1.51 s).
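
One way to keep the original identifiers without storing them as a vertex attribute is to hold them in a plain vector and map positional vertex indices back to them at export time. The following is a minimal base-R sketch; the names `orig_ids` and `members` are hypothetical and chosen only for illustration, and it assumes that cluster members of an unnamed graph are reported as positional indices.

```r
# Hypothetical example: original numeric identifiers, kept outside the graph.
n <- 5
orig_ids <- 10000000:(10000000 + n - 1)

# Suppose a cluster contains the vertices at positions 2 and 5
# (assumption: members of an unnamed graph are positional indices).
members <- c(2, 5)

# Map positions back to the original identifiers when exporting.
cluster_ids <- orig_ids[members]
print(cluster_ids)
```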
 
There is also another odd thing, this time with the authority_score function. First, here is our script:
 
# deleting variables
rm(list=ls())
 
# if not installed then install.packages("igraph")
# if not installed then install.packages("plyr")
library("igraph")
library("plyr")
 
# set working directory
setwd("********************")
 
# reading and creating graph
g_in = read.csv("SNA_05_Net.csv", sep=" ")
g = graph.data.frame(g_in, directed=FALSE)
 
node_list = as.matrix(as.numeric(V(g)$name))
write.table(node_list, file="SNA_Node_List.csv", quote=FALSE, sep="§", col.names=FALSE)
 
g = remove.vertex.attribute(g, "name")
 
# creating clusters -- cluster_optimal?
clust = groups(cluster_label_prop(g, weights=E(g)$weight))
 
# exporting clusters
cl = ldply(clust, data.matrix)
write.table(as.matrix(cl), file="sna_R.csv", quote=FALSE, sep="§", col.names=FALSE)
 
# creating sub-graphs
g_sub = lapply(clust, function(x){induced.subgraph(g, x)})
 
# creating cluster/subgraph KPIs
auth = lapply(g_sub, function(x){authority_score(x)$vector})
auth_scr = ldply(auth, data.matrix)
 
write.table(as.matrix(auth_scr), file="sna_R_ath_scr.csv", quote=FALSE, sep="§", col.names=FALSE)
 
edg = lapply(g_sub, function(x){edge_density(x, loops=FALSE)})
edg_dens = ldply(edg, data.matrix)
 
write.table(edg_dens, file="sna_R_edg_dens.csv", quote=FALSE, sep="§", col.names=FALSE)
 
dgr_o = lapply(g_sub, function(x){degree(x, mode=c("out"), loops=FALSE)})
dgr_out = ldply(dgr_o, data.matrix)
 
write.table(as.matrix(dgr_out), file="sna_R_dgr_out.csv", quote=FALSE, sep="§", col.names=FALSE)
 
dgr_i = lapply(g_sub, function(x){degree(x, mode=c("in"), loops=FALSE)})
dgr_in = ldply(dgr_i, data.matrix)
 
write.table(as.matrix(dgr_in), file="sna_R_dgr_in.csv", quote=FALSE, sep="§", col.names=FALSE)
 
eigv = lapply(g_sub, function(x){eigen_centrality(x)$vector})
eigv_cent = ldply(eigv, data.matrix)
 
write.table(as.matrix(eigv_cent), file="sna_R_eigv_cent.csv", quote=FALSE, sep="§", col.names=FALSE)
 
el = lapply(g_sub, function(x){as_edgelist(x)})
edg_list = ldply(el, data.matrix)
 
write.table(as.matrix(edg_list), file="sna_R_subgraps.csv", quote=FALSE, sep="§", col.names=FALSE)
 
Everything works fine except the "lapply(g_sub, function(x){authority_score(x)$vector})" statement, because there is one group on which authority_score fails: cluster number 293863. Both "lapply(g_sub[1:293862], function(x){authority_score(x)$vector})" and "lapply(g_sub[293864:length(g_sub)], function(x){authority_score(x)$vector})" run without problems. Running "lapply(g_sub, function(x){authority_score(x)$vector})" on the full list, or "lapply(g_sub[293863], function(x){authority_score(x)$vector})" on that cluster alone, produces the same error message:
 
„Error in .Call("R_igraph_authority_score", graph, scale, weights, options,  :
  At arpack.c:944 : ARPACK error, No shifts could be applied during a cycle of the Implicitly restarted Arnoldi iteration. One possibility is to increase the size of NCV relative to NEV”
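
To find every failing cluster in one pass instead of bisecting the list by hand, the call can be wrapped in tryCatch. The following is a minimal sketch in base R; `safe_lapply` and `toy_score` are hypothetical names, and `toy_score` is only a stand-in for authority_score so that the example runs without igraph.

```r
# Hypothetical helper: apply f to every element of xs, but record failures
# instead of stopping at the first error.
safe_lapply <- function(xs, f) {
  lapply(xs, function(x) {
    tryCatch(
      list(ok = TRUE, value = f(x), message = NA_character_),
      error = function(e) {
        list(ok = FALSE, value = NULL, message = conditionMessage(e))
      }
    )
  })
}

# Stand-in for authority_score(): fails on one particular input.
toy_score <- function(x) {
  if (x == 3) stop("ARPACK error (simulated)")
  x * 2
}

res <- safe_lapply(1:5, toy_score)

# Positions of the elements for which f raised an error.
failed <- which(!vapply(res, function(r) r$ok, logical(1)))
print(failed)
```

With the real data, the equivalent call would presumably be safe_lapply(g_sub, function(x) authority_score(x)$vector), with `failed` then listing the clusters that need a closer look.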
 
I searched Google to understand what causes the problem, but I did not find anything useful. The ARPACK manual may have an answer, but I would need more time to go through it. Here are the details of the subgraph for group 293863:
 
> g_sub[293863]
$`293863`
IGRAPH U-W- 4 3 --
+ attr: weight (e/n)
+ edges:
[1] 1--2 1--3 1--4
 
> E(g_sub[[293863]])$weight
[1] 270 5677 3032
 
I don’t see why the authority_score function cannot run on this kind of graph. It is a classic star graph, and I think there are many of them among the roughly 440 000 clusters/subgraphs.
I hope you can send us some kind of solution for this problem. Thanks in advance.
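
One observation that may be relevant, offered as an assumption rather than a confirmed diagnosis: for an undirected graph, authority scores are the dominant eigenvector of t(A) %*% A, and for a star graph the largest eigenvalue of that matrix is repeated, so the dominant eigenvector is not unique. The sketch below checks this with base R's eigen() on the weighted adjacency matrix of the subgraph above; whether this degeneracy is what triggers the ARPACK error has not been verified.

```r
# Weighted adjacency matrix of the 4-vertex star (edges 1--2, 1--3, 1--4).
w <- c(270, 5677, 3032)
A <- matrix(0, 4, 4)
A[1, 2:4] <- w
A[2:4, 1] <- w

# Authority scores are the dominant eigenvector of t(A) %*% A.
M <- t(A) %*% A
ev <- eigen(M, symmetric = TRUE)$values
print(ev)

# The two largest eigenvalues both equal the sum of squared weights,
# and the remaining two are (numerically) zero.
s <- sum(w^2)
print(s)
```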
 
Best regards,
Adam Sohonyai
