emacs-orgmode

[Orgmode] Re: [babel] R questions


From: Sébastien Vauban
Subject: [Orgmode] Re: [babel] R questions
Date: Tue, 08 Dec 2009 10:50:15 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)

Hi Dan,

Dan Davison wrote:
> Sébastien Vauban <address@hidden> writes:
>>
>> I have this table generated by a script:
>>
>> #+results: abc2008
>> | "2008/1"  | -78.59 |   1627.24 |
>> | "2008/2"  | -80.17 |    700.33 |
>> | "2008/3"  | -80.17 |     879.8 |
>> | "2008/4"  | -80.17 | -25823.17 |
>> | "2008/5"  | -80.17 |   3570.75 |
>> | "2008/6"  | -81.77 |    2377.8 |
>> | "2008/7"  | -81.77 |    2889.4 |
>> | "2008/8"  | -81.77 |   2612.92 |
>> | "2008/9"  | -81.77 |   1585.21 |
>> | "2008/10" |  -83.4 |   1561.42 |
>> | "2008/11" |  -83.4 |   2189.17 |
>> | "2008/12" |     "" |        "" |
>>
>> I want to draw the 12 months with the values side by side.
>>
>> Problem #1: the "" entries in the last line prevent the graph from being
>> generated (format error).
>
> Missing values in R are represented by the value NA. If you change the last
> line of your table to
>
> | "2008/12" |     NA |        NA |
>
> then it works [1], [2], [3].
>
> [1] Note no quotes around NA here. You asked a good question about quoting
>     in org-babel; it will be answered.

OK.
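
For the record, a minimal sketch of the NA behavior in plain R, outside
org-babel. The `tsv` string is an assumption made up for the example (a
tab-separated rendering of the last three rows), not necessarily what
org-babel writes. With an unquoted NA, the value columns stay numeric and
the missing cells can be skipped explicitly:

```r
# Hypothetical tab-separated version of the last three table rows.
tsv <- "2008/10\t-83.4\t1561.42\n2008/11\t-83.4\t2189.17\n2008/12\tNA\tNA\n"

# as.is = TRUE keeps the month labels as plain character strings.
abc <- read.table(text = tsv, sep = "\t", header = FALSE, as.is = TRUE)

str(abc)                  # V1 is character, V2 and V3 are numeric
num_ok  <- is.numeric(abc[, 3])
last_na <- is.na(abc[3, 3])

# Summaries need na.rm = TRUE to ignore the missing December value.
avg <- mean(abc[, 3], na.rm = TRUE)
```
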


> [2] I guess one could potentially think about dealing with missing values
>     more explicitly in org-babel. E.g. there could be a header arg
>     specifying what values are to be treated as missing. Nothing like that
>     exists currently.

I guess such a feature will be needed in the long term. Of course, even
specifying the desired behavior is already difficult, I think. One must know
the various languages and environments well, and try to abstract the best
behavior out of them.

Side note -- I know, for example, that there is an option in Access to let it
consider the empty string ('') as the NULL value, or not. Clear.

But what's an "NA" value in general?  Is 0 always a meaningful numeric
value?  Context-sensitive...

Side question -- You mentioned some way of keeping track of the bugs to fix
and features to add to Org. Same question here: where will these little
things be recorded so that they are not forgotten?  In one of the Worg
documents?


> [3] You might think that an alternative would be to do something like this
>     in R
>
> abc[abc == "\"\""] <- NA
>
> but the trouble is that with those double quotes present, R will interpret
> the column as containing character data rather than numeric, and things will
> not be pretty.

I believe you...
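
A small sketch of what goes wrong, assuming the cells really arrive in R as
the literal two-character string `""` (the data frame below is built by hand
for illustration, with made-up column names, and is not read through
org-babel). Replacing those cells with NA does not change the column type;
an explicit as.numeric() step, which is my addition here, is still needed:

```r
# Hand-built stand-in: the value columns are character because one cell
# holds the literal string "" (two double-quote characters).
abc <- data.frame(month = c("2008/11", "2008/12"),
                  v2    = c("-83.4", "\"\""),
                  v3    = c("2189.17", "\"\""),
                  stringsAsFactors = FALSE)

abc[abc == "\"\""] <- NA          # the replacement suggested in [3]
cls_before <- class(abc$v3)       # still "character", so barplot() would fail

# Explicit conversion is required to get numbers back.
abc$v2 <- as.numeric(abc$v2)
abc$v3 <- as.numeric(abc$v3)
cls_after <- class(abc$v3)        # now "numeric", with NA for December
```
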


>> #+srcname: expenses-bar-plot(abc = abc2008)
>> #+begin_src R :results file :file abc2008.pdf
>>     barplot(abc[,3], col = "red", main = "Profit and Loss 2008", las = 1,
>>             xlab = "Months", ylab = "EUR")
>> #+end_src
>>
>> Problem #2: I don't know how to ask for drawing the 2 columns. I've tried
>
> OK, so one point that is arguably relevant to this mailing list is that when
> org tables are read into R, the object that is created in R is a *data
> frame*. Not a matrix. (A data frame can have columns of different types;
> matrices are all one type). [4]
>
> [4] org-babel uses orgtbl-to-tsv followed by read.table() to convert the
> org table into a data.frame in R. A source of much confusion with
> R-beginners is that by default, read.table converts character columns into
> the *factor* data type. Note that org-babel currently uses 'as.is=TRUE' when
> calling read.table and therefore does *not* convert to factor. This may
> avoid some confusion among users but is memory-inefficient and misses out on
> other advantages of factors.
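
A minimal sketch of the difference described in [4], in plain R. The `tsv`
string is a made-up stand-in for the exported table. Both arguments are
spelled out because the default for character columns depends on the R
version (since R 4.0.0, read.table no longer converts to factor by default):

```r
# Hypothetical two-row, tab-separated export of the table.
tsv <- "2008/1\t-78.59\n2008/2\t-80.17\n"

# as.is = TRUE: character columns stay character.
kept <- read.table(text = tsv, sep = "\t", header = FALSE, as.is = TRUE)

# stringsAsFactors = TRUE: character columns become factors.
fact <- read.table(text = tsv, sep = "\t", header = FALSE,
                   stringsAsFactors = TRUE)

class(kept$V1)    # character
class(fact$V1)    # factor
levels(fact$V1)   # the distinct month labels
```
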
>
> So to solve your problem, you'd need to read the description of the height
> argument in the help page for barplot (?barplot), noting that it says
> "either a vector or matrix", and also noting that it says that bars
> correspond to columns (not rows), thus realising that you need to explicitly
> convert the relevant columns of the data frame to a matrix and then
> transpose.
>
> However, your two columns have rather different magnitude values and so are
> not very well suited for plotting on the same scale. Below I rescaled the
> first column by a factor of 20 so you can at least see the bars.
>
> #+srcname: expenses-bar-plot-two-columns(abc = abc2008)
> #+begin_src R :file abc2008.png
>   ## select the two columns, convert to matrix, transpose and rescale top
>   ## row.
>   x <- t(as.matrix(abc[,2:3])) * c(20,1)
>   barplot(x, col = rep(c("red","blue"), ncol(x)),
>           main = "Profit and Loss 2008", las = 1,
>           xlab = "Months", ylab = "EUR", beside = TRUE)
> #+end_src

Thanks a lot for the enlightening explanation, and for the corrected R code.
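
To check my understanding of the matrix step, here is a small sketch on the
first three rows only. The column names `fees` and `total` are made up for
the example, and names.arg is my addition to label the groups. Because R
fills and recycles column by column, multiplying a two-row matrix by
c(20, 1) scales the whole first row by 20 and leaves the second row alone:

```r
# First three rows of the table; column names are hypothetical.
abc <- data.frame(month = c("2008/1", "2008/2", "2008/3"),
                  fees  = c(-78.59, -80.17, -80.17),
                  total = c(1627.24, 700.33, 879.8),
                  stringsAsFactors = FALSE)

# Data frame -> matrix -> transpose: one column per month, one row per series.
x <- t(as.matrix(abc[, 2:3]))
dim(x)                            # 2 rows, 3 columns

# c(20, 1) is recycled down each column: row 1 times 20, row 2 times 1.
scaled <- x * c(20, 1)

# Draw into a null PDF device so the sketch writes no file.
pdf(file = NULL)
mids <- barplot(scaled, beside = TRUE, col = c("red", "blue"),
                names.arg = abc$month, las = 1)
invisible(dev.off())
```
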

Best regards,
  Seb

-- 
Sébastien Vauban




