bug-gnu-emacs

bug#13949: 24.4.1; `fill-paragraph' should not always put the buffer as


From: Dmitry Gutov
Subject: bug#13949: 24.4.1; `fill-paragraph' should not always put the buffer as modified
Date: Mon, 28 Mar 2016 00:20:08 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.0

On 03/28/2016 12:05 AM, Óscar Fuentes wrote:

I guess that the extra bits of entropy (160 vs 128) were a "fuzzy-warm"
factor too in using SHA-1 instead of MD5. Git must avoid collisions
among potentially hundreds of millions of objects (repos of that size
already exist or will exist in the near future.)

Are there fewer different texts we'd have to be able to discern?

Each and every hash
must be different from all the others and hence avoid the Birthday
Problem. Anyway, 128 bit hashes still would be good enough for those
huge repos. fill-paragraph needs to discriminate only between 2 chunks
of data.

I think you mean "2 chunks of data that must only be different in positioning and presence of newlines". Then yes, the odds of a collision must be slim. Still, I haven't seen (or performed) a sufficient analysis to evaluate them.
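For a rough sense of the numbers being discussed, here is a minimal Python sketch of the birthday bound. It assumes an ideal hash whose outputs are uniformly distributed, which is exactly the assumption that has not been verified for texts differing only in newlines; the function name and the 500-million object count are illustrative, not taken from Git or Emacs.

```python
import math


def collision_probability(num_objects: int, hash_bits: int) -> float:
    """Approximate chance of at least one collision among num_objects hashes.

    Assumption: the hash behaves like a uniform random function, so the
    birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2^bits)) applies.
    """
    exponent = -num_objects * (num_objects - 1) / (2.0 * 2.0 ** hash_bits)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Hypothetical repository with 500 million objects.
repo_128 = collision_probability(500_000_000, 128)
repo_160 = collision_probability(500_000_000, 160)

# The fill-paragraph case: only two chunks (before and after filling).
two_chunks_128 = collision_probability(2, 128)
```

Under that idealized model, even the 128-bit case for the large repository comes out far below any practical concern, and the two-chunk case is smaller still. The sketch says nothing about how a real hash behaves on near-identical inputs.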

b) Git has a global object index. It _can_ detect collisions, or at
least that detection can be implemented.

And what to do when a collision is detected?

Abort the current operation? Wait 50ms and retry creating the commit? Not 100% sure how the file contents are indexed: e.g. whether mtime factors into its hash value, too.

Back to the topic, your suggestion about comparing the pre- and post-
contents of the paragraph (and avoiding huge copies of the pre- contents
by restricting the copied area to the paragraph itself) does not work
when the file contains just one paragraph. Try visiting a big CSV dump
or log and press M-q. You can abort the operation with C-g, but if Emacs
starts to swap like crazy or exceeds the process memory limit and it is
killed...

You can choose to skip the "did it change" check if the region to check is too long. If the dump was one huge line, we can be confident that it will be changed upon filling.
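The idea can be sketched roughly as follows. This is a Python stand-in rather than Emacs Lisp, with `textwrap.fill` playing the role of the filling routine; the function name `fill_and_check` and the `max_check_len` threshold are hypothetical, and the threshold value is arbitrary.

```python
import textwrap


def fill_and_check(text: str, width: int = 70, max_check_len: int = 100_000):
    """Fill text and report whether it should count as modified.

    Returns (filled_text, modified). For regions longer than max_check_len
    the before/after comparison is skipped and the text is conservatively
    reported as modified.
    """
    filled = textwrap.fill(text, width=width)
    if len(text) > max_check_len:
        # Too long to be worth comparing: assume filling changed it.
        return filled, True
    # Short enough: compare the contents before and after filling.
    return filled, filled != text


already_filled, flag_short = fill_and_check("short line")
long_line = " ".join(["word"] * 50)
refilled, flag_long = fill_and_check(long_line)
_, flag_again = fill_and_check(refilled)
_, flag_skipped = fill_and_check("short line", max_check_len=5)
```

Here `flag_short` and `flag_again` come out false because filling leaves the text unchanged, `flag_long` is true, and `flag_skipped` is true only because the check was skipped. In a real buffer the saving would come from not copying the pre-fill contents at all for oversized regions, which this sketch does not model.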




