octave-maintainers

Re: huffmandict: Matlab compatibility, Matlab bugs..


From: Muthiah Annamalai
Subject: Re: huffmandict: Matlab compatibility, Matlab bugs..
Date: Mon, 09 Oct 2006 11:24:47 -0500

> From: Simon Persson <address@hidden>
> Subject: Re: huffmandict: Matlab compatibility, Matlab bugs..
> To: address@hidden
> Message-ID: <address@hidden>
> Content-Type: text/plain; charset="utf-8"
> 
> On Friday 06 October 2006 23:12, Muthiah Annamalai wrote:
> > > From: Simon Persson <address@hidden>
> > > Subject: Re: huffmandict: Matlab compatibility, Matlab bugs..
> > > To: address@hidden
> > > Message-ID: <address@hidden>
> > > Content-Type: text/plain; charset="utf-8"
> > >
> > > On Thursday 05 October 2006 21:59, Muthiah Annamalai wrote:
> > > > Hi there,
> > > > I'm trying to write some of the functions in the Matlab communication
> > > > toolbox (the source coding functions) for Octave-Forge.
> > > >
> > > > While I was doing that, I came across a funny bug in Matlab.
> > > >
> > > > Huffman encoding is done in 2 steps (according to Matlab),
> > > > and the first step generates the dictionary.
> > > > Essentially, all Huffman coding does is map low-probability symbols
> > > > to long codewords and vice versa.
> > > >
> > > > So the weird thing (in the Matlab implementation) is that 0-probability
> > > > symbols have long codewords assigned to them. I wonder... why would
> > > > someone in fact even have zero-probability symbols occur in their data?
> > >
> > > Maybe someone wants to estimate probabilities from a smaller sample than
> > > the full data to be encoded. Then codes for all symbols are still needed
> > > even if they don't occur in the analysed sample. That's the only reason I
> > > can think of.
> > >
> > > Simon
> >
> > I don't understand you. Huffman codes are supposed to be *static*,
> > *a priori* codewords. I'm not talking about adaptive Huffman codes
> > here. To me that is plain B.S. I asked a friend and his view doesn't
> > differ much; also, I wouldn't know how to do such a thing anyway.
> > Anyone who can 'make it compatible' is welcome to.
> >
> 
> Suppose you want to compress an ASCII text file and, for speed reasons, you
> generate your codes based on the relative frequencies of symbols in just the
> first paragraph. You still want to have codes for all the ASCII characters
> that may come later in the file.
> 
> I have never heard of anyone doing this, just imagining that this might be 
> useful in some situation.
> 
> Simon
Hi there,

I think there is more than a grain of reason in giving zero-probability
symbols codewords. I personally still fail to appreciate the idea,
but more than one person has pointed out the 'need' for it to me, and
one more symbol in the tree hardly makes a difference.

So I have removed the section of the code that dropped the
zero-probability symbols, and that way we can hopefully all be happy.
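
To illustrate what keeping the zero-probability symbols means, here is a
minimal sketch in Python. It is not the huffmandict.m code: the function
name huffman_dict and the example symbols and probabilities are made up
for illustration, and it only builds a plain binary Huffman code.

```python
import heapq
from itertools import count


def huffman_dict(symbols, probs):
    """Return {symbol: bit-string} for a binary Huffman code.

    Zero-probability symbols are NOT filtered out, so every symbol
    still gets a codeword (they end up among the longest ones).
    """
    if len(symbols) != len(probs):
        raise ValueError("symbols and probs must have the same length")
    if len(symbols) == 1:
        return {symbols[0]: "0"}

    tie = count()  # tie-breaker so the heap never has to compare lists
    heap = [(p, next(tie), [s]) for s, p in zip(symbols, probs)]
    heapq.heapify(heap)
    codes = {s: "" for s in symbols}

    while len(heap) > 1:
        # Merge the two least probable groups; prepend one bit to each.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        for s in group0:
            codes[s] = "0" + codes[s]
        for s in group1:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p0 + p1, next(tie), group0 + group1))
    return codes


# Hypothetical example: "d" never occurs, but still receives a codeword.
codes = huffman_dict(["a", "b", "c", "d"], [0.5, 0.3, 0.2, 0.0])
for sym in sorted(codes):
    print(sym, codes[sym])
```

In this sketch the zero-probability symbol is merged first, so it sits at
the deepest level of the tree and costs the other symbols at most one
extra bit on the branch it shares.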

Again, I am sorry to have upset more than one person.
So it is indeed not a bug.

New code at:
http://octave.cvs.sourceforge.net/octave/octave-forge/main/comm/inst/huffmandict.m?revision=1.7&view=markup

Thanks,
Muthu
