Re: imread on large tiff

help-octave

From:	John Hayes
Subject:	Re: imread on large tiff
Date:	Fri, 17 Jan 2014 00:50:47 +0100
On 17 Jan 2014 at 00:01, Carnë Draug wrote:

> Please always include the mailing list when replying so others can
> read it in the future or chime in to give further help and advice.
D’oh, I thought I hit reply-all, and I usually check that but clearly didn’t in 
this case. Apologies all around...

> On 16 January 2014 22:25, John Hayes <address@hidden> wrote:
>> On 16 Jan 2014 at 19:17, Carnë Draug wrote:
>> 
>>> On 15 Jan 2014 20:58:19 +0100,  John Hayes <address@hidden> wrote:
>>>> Hi all,
>>>> 
>>>> I have a problem where if I try either imfinfo or imread on a 640x540x1800 
>>>> TIFF file that is ~1.2 GB, Octave hangs seemingly indefinitely. This is 
>>>> occurring with both GraphicsMagick 1.3.18 and 1.3.19. I’m on OS X 10.6.8 
>>>> with gcc 4.8.2 (I built) and octave 3.8.0 downloaded on Dec. 27, 2013. I’m 
>>>> using the filename as argument to imfinfo and for imread, I use this:
>>>>> video=zeros(numFrames, numRows, numCols, 'single');
>>>>> disp('Reading movie file...')
>>>>> for i=1:numFrames
>>>>>   disp(i);
>>>>>   video(i,:,:)=imread(fileName,i);
>>>>> end
>>>> 
>>>> At least for imread, it seems to be hanging somewhere in the 
>>>> __magick_read__ function, but I haven’t dug into it too deeply yet. I’ve 
>>>> extracted the first 10 frames of the TIFF using ImageJ, moved it into the 
>>>> location of the big file, and reran the script on this file. In this case, 
>>>> they seem to work. This suggests the problem may not be that it’s hanging 
>>>> but that it’s just going really slow. But it’s not immediately obvious to 
>>>> me why imfinfo would be so slow (or imread for that matter).
>>>> 
>>>> Btw, GraphicsMagick was configured with the following command:
>>>>> ./configure --with-quantum-depth=32 --enable-shared --disable-static 
>>>>> --with-magick-plus-plus=yes
>>>> 
>>>> As a final note, when I Ctrl-C to stop Octave, it takes a long time to 
>>>> actually quit. If I hit Ctrl-C multiple times, Octave reports the 
>>>> following:
>>>>> ^C^C^Cpanic: Interrupt -- stopping myself...
>>>>> attempting to save variables to 'octave-workspace'...
>>>>> ^Cpanic: attempted clean up apparently failed -- aborting...
>>>> and dumps a HUGE core file to /cores/ (>6.5 GB!!). In the process of 
>>>> writing the core file, it clearly affects the general I/O on the computer 
>>>> (to the extent I often have to reboot as sudo kill -9 won’t stop it), so 
>>>> there seems to be ‘something’ gobbling up a lot of memory. :)
>>>> 
>>>> Thanks for any insight you can provide on how to solve the main problem 
>>>> (imfinfo/imread usage with a 1.2 GB TIFF). Any clues as I dig deeper would 
>>>> be most helpful...
>>>> 
>>>> Best regards,
>>>> 
>>>> John
>>> 
>>> Hi John
>>> 
>>> I would not recommend using that syntax to read such a large file. Use
>>> 
>>> video = imread (filename, "Index", "all")
>>> 
>>> or
>>> 
>>> video = imread (filename, "Index", 1:numFrames);
>>> 
>>> The reason it's so slow is the nature of multipage TIFF. If you
>>> want to find where image #100 is, you need to go through the previous
>>> 99 images. You have 1800 pages so... you get what I mean. By using
>>> this syntax, Octave will read everything in one go.
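To put a rough number on the explanation above, here is a minimal sketch of the cost of the two approaches. It assumes, as a simplification, that reading page i on its own visits pages 1 through i, and that the unit of cost is one "page visit"; the variable names are illustrative and real timings will depend on the library and the disk.

```octave
% Rough cost model for reading an 1800-page TIFF (assumption: a request for
% page i on its own has to walk pages 1..i before it can read page i).
numFrames = 1800;

% One imread call per frame, as in the original loop.
per_frame_visits = sum (1:numFrames);   % 1 + 2 + ... + 1800

% One imread call for all frames, e.g. imread (fileName, "Index", "all").
single_pass_visits = numFrames;

printf ("per-frame loop: %d page visits\n", per_frame_visits);
printf ("single call:    %d page visits\n", single_pass_visits);
printf ("ratio:          %.1f\n", per_frame_visits / single_pass_visits);
```

Under this model the per-frame loop grows quadratically with the number of pages, which would be consistent with 10 frames working fine while 1800 frames appear to hang.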
>> Hi Carnë,
>> 
>> Thanks for the advice -- that’s been very helpful. Since sending my email I 
>> saw where you had previously had a similar problem a few years ago 
>> (http://octave.1599824.n4.nabble.com/imread-long-time-with-large-multipage-tif-td4637026.html),
>>  and I had been digging through the code trying to figure out a solution 
>> before that (I didn’t think to call the multiframe tiff ‘multipage’ when 
>> googling).
> 
> I think the multipage nomenclature comes from the TIFF specification, and I
> think it originates from the format's use in faxes.
Thanks, that’s very informative.

>>> You mention using imfinfo. I'm assuming you're doing this to get the
>>> number of rows, columns and frames in your image. imfinfo will return
>>> a struct array with a lot of fields for each frame in your file. Note
>>> that it is possible for each page on your TIFF to have different info,
>>> even different size, we can't just deduce all that from the first
>>> page. And that is slow.
>> Yes, I’ve decided to bypass it altogether and just use my known values for 
>> the dimensions since imfinfo seems to read the whole file just like imread 
>> in ./libinterp/dldfcn/__magick_read__.cc.  That is, the ‘read_file’ function 
>> is unnecessarily called MANY times for me, but from the discussion it 
>> doesn’t seem like ImageMagick/GraphicsMagick really supports a better 
>> mechanism (and this approach sounds schlocky at best: 
>> http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=13439). So, 
>> your suggestion has been a great workaround.
> 
> It's not a workaround, it's the documented usage (not in Matlab which
> only supports this for gif files). You will have a problem if the
> pages are not all equal in size and bit depth though.
> 
> In defense of GraphicsMagick, there's not much that they can do.
> It's just how the TIFF format works: you get the pointer to each
> page at the end of the previous one. What Matlab did (according to
> their documentation) is to read the whole image once with imfinfo
> which returns an array with the start location of each page in the
> file. This can be passed as an extra argument to imread so it knows
> where to start.
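The pointer chain described above can be illustrated with a small sketch. It builds a synthetic little-endian buffer laid out like a TIFF header followed by three image file directories (IFDs), then follows the "next IFD" offsets to collect where each page starts. The buffer is not a valid image (the IFDs have no entries), and the helper names (le16, le32, rd16, rd32) are made up for this illustration; it only shows why page N cannot be located without visiting the pages before it.

```octave
% Encode integers as little-endian byte rows (illustrative helpers).
le16 = @(v) mod (floor (v ./ 256 .^ (0:1)), 256);
le32 = @(v) mod (floor (v ./ 256 .^ (0:3)), 256);

% Header: byte order "II", magic number 42, offset of the first IFD (8).
buf = [double("II"), le16(42), le32(8)];
% Three empty IFDs of 6 bytes each: entry count (0), then next-IFD offset.
buf = [buf, le16(0), le32(14)];   % IFD at offset 8,  next at 14
buf = [buf, le16(0), le32(20)];   % IFD at offset 14, next at 20
buf = [buf, le16(0), le32(0)];    % IFD at offset 20, 0 marks the last page

% Decode little-endian integers at a zero-based byte offset p.
rd16 = @(p) sum (buf(p+1:p+2) .* 256 .^ (0:1));
rd32 = @(p) sum (buf(p+1:p+4) .* 256 .^ (0:3));

% Walk the chain: each IFD ends with the offset of the next one.
offsets = [];
next = rd32 (4);
while (next ~= 0)
  offsets(end+1) = next;
  n_entries = rd16 (next);
  next = rd32 (next + 2 + 12 * n_entries);   % skip count and 12-byte entries
end
disp (offsets)
```

Collecting the offsets once and reusing them is the idea behind passing page start locations from imfinfo to imread, as described for Matlab above.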
OK, Matlab may not document this well. But the code I was using was from 
someone that was using Matlab on Linux (I don’t know the version off-hand, but 
a recent one). I had noticed that Matlab’s documentation was very unspecific in 
my original usage for a TIF file (but that a Linux Matlab-user devised) so 
something must be in flux over there (or Mathworks' online docs are simply out 
of sync with the reality of their recent versions). 

The coding in Octave seems to be very defensive, which I agree is good, 
especially for the variety of formats these functions are intended to support. 

But realistically, I wonder, who has a .tif or .gif with multiple depths and 
frame dimensions within a single file and wants to use it with this function? 

That just sounds very bizarre to me and a VERY weird special case that >99% of 
users of imread would never have need of. It’s not a complaint on the 
implementation, more of a complaint if Matlab actually can handle this because 
it sounds crazy to me. 

I’m sorry, I don’t mean any disrespect to anyone because maybe some people find 
this useful that I’m not aware of (but I would be interested in); it just seems 
to me like if the dimensions change frame-to-frame one should change the file 
it’s stored in. I’m racking my brain on this, but the only example I can think 
of someone that would find this useful is if someone was converting a 
presentation to .pdf, then converting a .pdf to .tif, where each slide may be 
slightly different dimensions. But even that sounds like a bad idea to me... 
Maybe that’s a flaw with the TIFF format though that I was previously unaware 
of...

Personally, I still think the fundamental problem is with GraphicsMagick++ as 
this link seems to indicate they (or the original ImageMagick++) have the 
facility for accessing individual or range of frames/pages: 
http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=13439
It seems they don’t have an easily accessible API function for it though which 
busts the usability for Octave... And presumably that doesn’t extend to other 
formats, which I further presume is the principal reason for using 
GraphicsMagick++ to begin with for us!

If others agree, I’ll hop on the GraphicsMagick mailing list and inquire about 
the lack of access through the API problem (I think) we’re having. And 
hopefully work towards it as best I can....

But, as I said, I mean no disrespect to the implementers of Octave, because the 
imread function looks like a beast to implement fairly for many file formats; 
I’m just curious about it and suspect that Mathworks does a lot of shady stuff 
« just to make it work » in special cases (for their paying customers I’m sure 
:)...

>>> Also, because the way GraphicsMagick works, if you built it with
>>> quantum-depth 32, images are read as uint32. Octave then rescales the
>>> values and casts the image to the correct data type. This means that
>>> if your image was uint8, it may temporarily take up 4 times its
>>> original size. And that's considering that the image in your tiff file
>>> is not compressed. So a careful choice of the options used to build
>>> GraphicsMagick should be made if you're dealing with such unusual
>>> cases as 1.2GB images. Maybe a quantum-depth of 16 will be enough for
>>> you.
>> Yes, I’ve rebuilt GraphicsMagick with a depth of 16 since that’s what I’m 
>> using from ImageJ (and uncompressed as well).
>> 
>> Thanks a lot, and best regards,
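As a back-of-the-envelope check on the quantum-depth point, the arithmetic can be sketched as follows. It assumes 16-bit, single-channel, uncompressed samples (which matches the ~1.2 GB file size) and counts one quantum-sized value per pixel for the temporary buffer; the library's real internal layout may need more than that, so treat the figures as a lower bound.

```octave
% Memory estimate for a 640x540x1800 stack (assumptions: one channel,
% uncompressed, 16-bit samples on disk, one quantum-sized value per pixel).
pixels = 640 * 540 * 1800;

bytes_on_disk = pixels * 2;   % 16-bit samples in the file
bytes_q16     = pixels * 2;   % temporary buffer with quantum depth 16
bytes_q32     = pixels * 4;   % temporary buffer with quantum depth 32

GiB = 2 ^ 30;
printf ("pixels:            %d\n", pixels);
printf ("file data:         %.2f GiB\n", bytes_on_disk / GiB);
printf ("quantum depth 16:  %.2f GiB\n", bytes_q16 / GiB);
printf ("quantum depth 32:  %.2f GiB\n", bytes_q32 / GiB);
```

A 'single' array of the same shape, as preallocated in the original loop, also takes 4 bytes per element, so it is the same size as the quantum-depth-32 estimate.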