Thanks, I agree.
It fails for > 4GB.
The msvcrt.dll!fwrite(buff, (4G+0x1000), 1, file) works as following:
(based on assembly, not the original source, irrelevant parts have been omitted)
size_t fwrite(const void* buff, size_t element_size, size_t element_count, FILE* file_p)
{
size_t size_t_total_size = element_size * element_count;
size_t size_t_remain_size = size_t_total_size;
unsigned int u32_written_bytes;
unsigned int buf_size;
/* The register used is r12d but not r12.
* So I suspect that Microsoft wrote it as an unsigned int type
* (or msvc compiler bug? seems unlikely)
*/
unsigned int u32_chunk_size;
while (true) {
if ((file_p->flags & 0x10C) != 0) {
buf_size = file_p->buf_size;
}
else {
// Always reaches here on the first fwrite() after fopen().
buf_size = 4096; // mov r15d, 1000h
}
if (size_t_remain_size > buf_size) {
u32_chunk_size = size_t_remain_size;
if (buf_size) {
// div ... ; sub r12d,edx; size_t stored into r12d , lost high 32 bits
// u32_chunk_size = 0x100000FFF - 0x100000FFF % 0x1000
// = 0x100000FFF - 0xFFF
// = 0x1 0000 0000
// = (u32) 0x1 0000 0000
// = 0
u32_chunk_size = size_t_remain_size - (size_t_remain_size % buf_size);
}
//call _write() with zero size, returns 0
u32_written_bytes = __write(file_p, data_buff, u32_chunk_size);
// They didn't check if __write() returns 0.
if (u32_written_bytes == -1 || u32_written_bytes < u32_chunk_size) {
return (size_t_total_size - size_t_remain_size) / element_size;
}
//size_t_remain_size -= 0
size_t_remain_size -= u32_written_bytes;
buff += u32_written_bytes;
//size_t_remain_size will never decrease to zero, then while(true) loop forever.
if (size_t_remain_size == 0) {
return element_count;
}
// ...
}
else {
// ...
}
}
// ...
}
According to current analysis, size <= 4GB all are safe, however there are many
versions of msvcrt, this bug exists on Server 2008/2019/2022 and Windows 11(all
with full latest updates), and it may also exist in other versions, but it is difficult to
check each version individually. I am not sure if all versions handle boundary sizes
like 2GB/4GB correctly. So I prefer a relatively conservative value: 128MB.
Maybe we could use #ifdef _WIN32 to differentiate the handling between Linux and
Windows. For Linux, it remains unchanged, while for Windows, it processes by chunks
with max_chunk_sizeto 1GB.