igraph-help
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] cut_prob in igraph_motifs_randesu_estimate


From: Eric Weese
Subject: Re: [igraph] cut_prob in igraph_motifs_randesu_estimate
Date: Thu, 4 Sep 2008 15:42:46 -0400

Hmm... I have to admit I'm still a bit confused. I'm just interested in counting all connected subgraphs of a given size (without any categorizing), and I think the standard RAND-ESU algorithm does what I want. But under "plain vanilla" RAND-ESU, the (expected) fraction of subgraphs enumerated should always be

prod(1 - cut.prob)

right? When I try that with graph.motifs.est, though, I get different results for cut.prob = c(0, 0, 0.5) and cut.prob = c(0, 0.5, 0), even though prod(1 - cut.prob) is the same in both cases. Is there a different function that will give the standard RAND-ESU behaviour, or am I just misunderstanding how RAND-ESU is supposed to work?

Thanks...

-Eric


Date: Sat, 30 Aug 2008 13:14:07 +0200
From: Csardi Gabor <address@hidden>
Subject: Re: [igraph] cut_prob in igraph_motifs_randesu_estimate
To: Help for igraph users <address@hidden>
Message-ID: <address@hidden>
Content-Type: text/plain; charset=us-ascii

Eric, thanks for your mail, sorry for the delay, I was away for a while.

I think the igraph code is correct, 'cut.prob' gives the cutting
probability at each level of the search tree. In other words, we _stop_
with this probability.

Your second test case is simple, it indeed finds about half of the motifs. The third test case is more complicated, however. I could not calculate
the expected number of motifs with pen and paper, and it might depend
on how "clustered" the triads are in your graph.

See e.g.:

g1 <- graph.star(100, mode="undirected")
g2 <- graph.ring(100, circ=FALSE)
graph.motifs.no(g1)
[1] 4851
graph.motifs.no(g2)
[1] 98
mean(replicate(100, graph.motifs.est(g1, sample=V(g1), cut.prob=c (0,0.5,0))))
[1] 29.08667
mean(replicate(100, graph.motifs.est(g2, sample=V(g2), cut.prob=c (0,0.5,0))))
[1] 29.01

The purpose of having 'cut.prob' for 'graph.motifs.est' is that
you might want to know how many motifs you expect to find with
'graph.motifs', without actually classifying them, because much of the
running time is the classification part. (Especially if bigger
motifs will be supported some time in the near future.)

Best,
Gabor

On Tue, Aug 26, 2008 at 02:17:11PM -0400, Eric Weese wrote:
Hi all -

I'm getting some results that I don't understand using the R
interface to igraph:

test.graph <- graph.formula(D-A-B-G, E-A-C-I, F-B-C-H)
replicate(10, graph.motifs.est(test.graph,sample=V
(test.graph),cut.prob=c(0,0,0)))
replicate(10, graph.motifs.est(test.graph,sample=V
(test.graph),cut.prob=c(0,0,0.5)))
replicate(10, graph.motifs.est(test.graph,sample=V
(test.graph),cut.prob=c(0,0.5,0)))

the first two sets of results make sense to me (cutting half the time
reduces the estimate in half), but the third one seems way too low.
I see the following code around line 421 of motifs.c:

      if (level < size-1 &&
          !igraph_vector_empty(&adjverts) &&
          (cp==0 || RNG_UNIF01() > cp)) {
        /* yes, step down */
...

but this means if the random number is low, then we go to "else".
Should this be


      if (level < size-1 &&
          !igraph_vector_empty(&adjverts)) {
        /* yes, step down */
        long int neifather=igraph_vector_pop_back(&adjverts);
        long int nei=igraph_vector_pop_back(&adjverts);
        IGRAPH_CHECK(igraph_stack_push(&stack, neifather));
        IGRAPH_CHECK(igraph_stack_push(&stack, nei));
        IGRAPH_CHECK(igraph_stack_push(&stack, level+1));
        if(cp==0 || RNG_UNIF01() > cp) {
          IGRAPH_CHECK(igraph_vector_push_back(&vids, nei));
...

instead?  This seems to work better for me...


Thanks!

-Eric
 (PhD candidate, Economics, MIT)


_______________________________________________
igraph-help mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/igraph-help




reply via email to

[Prev in Thread] Current Thread [Next in Thread]