You are right that array allocation is slow in any language. To get a baseline for comparison, I implemented the same algorithm in Lisp, and I did so in the worst possible way: by completely reallocating the array for each row and copying the old content into the new one (code at the end of this message). I did it that way because that is what the APL code ends up doing, and I didn't want to give the Lisp program any unfair advantage. This was, as you may imagine, very slow: it took 51 seconds to load the entire file.
The APL version that I posted in my previous message takes
Now, for a solution. Reading the file twice is quite ugly, and I'd rather avoid doing so.
I could rewrite the Lisp program to use in-place reallocation of the array, which would speed it up by several orders of magnitude. The problem is that there is no mechanism in GNU APL to do the same. I think adding a new construct to the language for this would be ugly, but perhaps there is a way for GNU APL to detect the specific “append to the end of an array” idiom so that it could take an optimised path that doesn't require constructing a new array on every append.
Jürgen, is that even feasible?
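To make the cost difference concrete, here is a rough Python sketch of the two strategies. It is only an illustration, not the Lisp code from this message and not a description of how GNU APL works internally. The function names are made up, and the doubling growth factor is an assumption; any geometric factor gives the same amortised behaviour. Both functions count how many existing elements get copied while appending rows one at a time.

```python
def append_by_full_copy(rows):
    """Allocate a new array on every append and copy the old content over.

    Returns (result, number_of_element_copies).
    """
    table = []
    copied = 0
    for row in rows:
        new_table = [None] * (len(table) + 1)  # fresh allocation, one slot larger
        for i, old in enumerate(table):        # copy everything seen so far
            new_table[i] = old
            copied += 1
        new_table[-1] = row
        table = new_table
    return table, copied


def append_with_doubling(rows):
    """Keep spare capacity and double it only when it runs out.

    Returns (result, number_of_element_copies).
    """
    capacity = 1
    storage = [None] * capacity
    size = 0
    copied = 0
    for row in rows:
        if size == capacity:                   # out of room: grow geometrically
            capacity *= 2
            new_storage = [None] * capacity
            for i in range(size):
                new_storage[i] = storage[i]
                copied += 1
            storage = new_storage
        storage[size] = row                    # the common case copies nothing
        size += 1
    return storage[:size], copied
```

For 1000 rows the first version copies 499500 elements and the second copies 1023, so the per-row reallocation grows quadratically with the number of rows while the version with spare capacity stays linear.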
Now, reading a CSV file like this is a very common case, and I am of the opinion that having a dedicated native function for this purpose would solve perhaps 75% of use cases where this is an issue. I'll be happy to implement a flexible function that can do this, which would make GNU APL much more pleasant to use for data analysis.
Jürgen, if I were to implement this, would you prefer it to be a dynamically loaded library, or would you prefer having it as a quad-function?