lzip-bug
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: LZSS is not LZ77


From: Antonio Diaz Diaz
Subject: Re: LZSS is not LZ77
Date: Tue, 10 Oct 2023 18:42:27 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14

wrotycz wrote:
So, in my humble opinion, to do be precise and do justice to creators of
the scheme, namely Storer and Szymanski, in Readme it would be good to
state about LZSS rather than LZ77 as they have as much in common as
Stevenson's locomotive and TGV.

I have never seen neither the papers from Lempel and Ziv nor the paper from Storer and Szymanski, and most probably Igor Pavlov (the inventor of LZMA) has not seen them either. One could say that LZMA is based on the LZ77 concept of deduplicating strings of text, rather than in the specifics of LZ77 (or LZSS) itself.

In fact, all one needs to know to develop a LZ-style compression algorithm is the following idea:

https://en.wikipedia.org/wiki/LZ77
"LZ77 algorithms achieve compression by replacing repeated occurrences of data with references to a single copy of that data existing earlier in the uncompressed data stream. A match is encoded by a pair of numbers called a length-distance pair, which is equivalent to the statement "each of the next length characters is equal to the characters exactly distance characters behind it in the uncompressed stream"."

IMO the pseudocode at https://en.wikipedia.org/wiki/LZ77#Pseudocode describes well the fast variant of LZMA used by lzip. You may compare it with the description of lzip at https://datatracker.ietf.org/doc/html/draft-diaz-lzip-08#section-3.5:

"The fast encoder in the [lzip] reference implementation shows how this can be done in almost the simplest way possible; issuing the longest match found, or a literal byte if no match is found, and repeating until all the data have been compressed."

Moreover, https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Storer%E2%80%93Szymanski says: "The main difference between LZ77 and LZSS is that in LZ77 the dictionary reference could actually be longer than the string it was replacing."

which is an important feature of LZMA. So I tend to think that LZMA has more in common with LZ77 than with LZSS.

3. CSC578B: Topics in Software Applications: "Data Compression"
https://heat.csc.uvic.ca/coview/course/2020051/CSC578B
Data Compression (Summer 2020) - Lecture 10 - Lempel-Ziv Schemes

All I can see in the link above is the message:
"We are sorry, but your web browser must have JavaScript enabled in order for this application to operate correctly."

Antonio.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]