octave-maintainers

Re: huffmandict: Matlab compatibility, Matlab bugs..


From: Simon Persson
Subject: Re: huffmandict: Matlab compatibility, Matlab bugs..
Date: Sat, 7 Oct 2006 01:02:37 +0200
User-agent: KMail/1.9.4

On Friday 06 October 2006 23:12, Muthiah Annamalai wrote:
> > From: Simon Persson <address@hidden>
> > Subject: Re: huffmandict: Matlab compatibility, Matlab bugs..
> > To: address@hidden
> > Message-ID: <address@hidden>
> > Content-Type: text/plain; charset="utf-8"
> >
> > On Thursday 05 October 2006 21:59, Muthiah Annamalai wrote:
> > > Hi there,
> > > I'm trying to write some of the functions from the Matlab
> > > Communications Toolbox (the source coding functions) for Octave-Forge.
> > >
> > > While I was doing that, I came across a funny bug in Matlab.
> > >
> > > Huffman encoding is done in 2 steps (according to Matlab),
> > > and the first step generates the dictionary.
> > > Essentially, all Huffman coding does is map low-probability symbols
> > > to long codewords and vice versa.
> > >
> > > So the weird thing (in the Matlab implementation) is that
> > > 0-probability symbols have long codewords assigned to them. I wonder...
> > > why would anyone even have zero-probability symbols occur in their data?
> >
> > Maybe someone wants to estimate probabilities from a smaller sample than
> > the full data to be encoded. Then codes for all symbols are still needed
> > even if they don't occur in the analysed sample. That's the only reason I
> > can think of.
> >
> > Simon
>
> I don't understand you. Huffman codes are supposed to be *static* and
> *a priori* codewords. I'm not talking about adaptive Huffman codewords
> here. To me that is plain B.S.; I asked a friend and he doesn't
> differ much. Also, I wouldn't know how to do such a thing anyway.
> Anyone who can 'make it compatible' is welcome to.
>

Suppose you want to compress an ASCII text file and, for speed reasons, you
generate your codes from the relative frequencies of the symbols in just the
first paragraph. You would still want codes for all the ASCII characters that
may come later in the file.

I have never heard of anyone doing this; I am just imagining that it might be
useful in some situation.
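
To make the scenario concrete, here is a minimal sketch in Python. It is
not the Matlab or Octave-Forge huffmandict; the function name huffman_dict,
the sample strings and the choice of a 128-symbol ASCII alphabet are all
made up for illustration. It builds a static Huffman dictionary from the
counts in a short sample while keeping every ASCII symbol in the alphabet,
so the symbols that never occur in the sample get weight 0 and still
receive (long) codewords.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_dict(weights):
    """Return {symbol: bitstring} for a dict of symbol -> non-negative weight."""
    if len(weights) == 1:
        # A single symbol still needs a one-bit codeword.
        return {sym: "0" for sym in weights}
    tie = count()  # unique tie-breaker so the heap never compares symbol lists
    heap = [(w, next(tie), [sym]) for sym, w in weights.items()]
    heapq.heapify(heap)
    codes = {sym: "" for sym in weights}
    while len(heap) > 1:
        # Merge the two lightest subtrees; zero-weight symbols merge first,
        # so they end up deepest in the tree.
        w0, _, syms0 = heapq.heappop(heap)
        w1, _, syms1 = heapq.heappop(heap)
        for s in syms0:
            codes[s] = "0" + codes[s]
        for s in syms1:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (w0 + w1, next(tie), syms0 + syms1))
    return codes


# Hypothetical data: probabilities are estimated from `sample` only.
sample = "the first paragraph of the file"
rest = "Later text may use symbols such as Q, z or 42!"

alphabet = [chr(c) for c in range(128)]  # assumed alphabet: 7-bit ASCII
counts = Counter(sample)
weights = {sym: counts.get(sym, 0) for sym in alphabet}  # unseen symbols -> 0

codes = huffman_dict(weights)

# The whole file can be encoded even though `rest` uses unseen symbols.
encoded = "".join(codes[ch] for ch in sample + rest)
```

If the zero-weight symbols were dropped from the dictionary instead, the
join above would fail with a KeyError at the first character of `rest`
that does not occur in `sample`.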

Simon
