From: hellord
Subject: Re: [PATCH] contrib/elf2dmp: a workaround for the buggy msvcrt.dll!fwrite
Date: Wed, 10 Jul 2024 16:02:18 +0800


 
On Tue, Jul 9, 2024 at 10:39 PM Peter Maydell <peter.maydell@linaro.org> wrote:

> > +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)

> I think we could add a comment here, something like:
>
> /*
>  * Maximum size to fwrite() to the output file at once;
>  * the MSVCRT runtime will not correctly handle fwrite()
>  * of more than 4GB at once.
>  */
>
> That will act as a reminder about why we do it.


Thanks, I agree.
 
> (Does the library fwrite fail for > 4GB, or for >= 4GB ?
> Your commit message says the former, so I've gone with that,
> but if it's an "overflows 32 bit variable" kind of bug then
> 4GB exactly probably also doesn't work.)


It fails for > 4GB.
msvcrt.dll!fwrite(buff, (4G + 0xFFF), 1, file) works as follows
(based on the assembly, not the original source; irrelevant parts have been omitted):

size_t fwrite(const void* buff, size_t element_size, size_t element_count, FILE* file_p)
{
    size_t size_t_total_size = element_size * element_count;
    size_t size_t_remain_size = size_t_total_size;

    unsigned int u32_written_bytes;
    unsigned int buf_size;

    /* The register used is r12d, not r12, so I suspect that
     * Microsoft declared this variable as unsigned int
     * (or it's an MSVC compiler bug, which seems unlikely).
     */
    unsigned int u32_chunk_size;

    while (true) {

        if ((file_p->flags & 0x10C) != 0) {
            buf_size = file_p->buf_size;
        }
        else {
            // Always reaches here on the first fwrite() after fopen().
            buf_size = 4096;  // mov     r15d, 1000h
        }

        if (size_t_remain_size > buf_size) {

            u32_chunk_size = size_t_remain_size;

            if (buf_size) {

                // div ...; sub r12d, edx: the size_t result is stored
                // into r12d, so the high 32 bits are lost.
                // u32_chunk_size = 0x100000FFF - 0x100000FFF % 0x1000
                //                = 0x100000FFF - 0xFFF
                //                = 0x100000000
                //                = (u32)0x100000000
                //                = 0
                u32_chunk_size = size_t_remain_size - (size_t_remain_size % buf_size);
            }

            // __write() is called with zero size and returns 0.
            u32_written_bytes = __write(file_p, buff, u32_chunk_size);

            // They didn't check if __write() returns 0.
            if (u32_written_bytes == -1 || u32_written_bytes < u32_chunk_size) {
                return (size_t_total_size - size_t_remain_size) / element_size;
            }

            //size_t_remain_size -= 0
            size_t_remain_size -= u32_written_bytes;
            buff = (const char *)buff + u32_written_bytes;

            // size_t_remain_size never decreases to zero, so the
            // while (true) loop spins forever.
            if (size_t_remain_size == 0) {
                return element_count;
            }
            // ...
        }
        else {
            // ...
        }
    }
    // ...
}

Notes:
1. The buggy msvcrt.dll lives at c:\windows\system32\msvcrt.dll (MinGW-w64 links against it).
2. The fwrite() implementation in ucrtbase.dll, another MSVC runtime library, is correct (MSVC links against it by default).
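
The truncation is easy to reproduce in isolation. Here is a small standalone
C program (illustrative only, not part of the patch; it assumes a 64-bit build
where size_t is 64 bits wide) that replays the arithmetic from the trace above:

#include <stdio.h>
#include <stddef.h>

int main(void)
{
    size_t remain = 0x100000FFFULL;  /* 4GB + 0xFFF, as in the trace above */
    size_t buf_size = 0x1000;        /* stream buffer size right after fopen() */

    /* Full-width result of the chunk computation: 0x100000000 */
    size_t chunk64 = remain - remain % buf_size;

    /* What msvcrt actually keeps: the low 32 bits, i.e. 0 */
    unsigned int chunk32 = (unsigned int)chunk64;

    printf("chunk64 = 0x%zx, chunk32 = %u\n", chunk64, chunk32);
    return 0;
}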

 
> Is there a particular reason to use 128MB here? If the
> runtime only fails on 4GB or more, maybe we should use
> a larger MAX_CHUNK_SIZE, like 2GB ?

According to my analysis so far, all sizes <= 4GB are safe. However, there are many
versions of msvcrt: this bug exists on Server 2008/2019/2022 and on Windows 11 (all
with the latest updates), and it may exist in other versions too, but it is difficult to
check each version individually. I am not sure whether every version handles boundary
sizes like 2GB/4GB correctly, so I prefer a relatively conservative value: 128MB.

Maybe we could use #ifdef _WIN32 to differentiate the handling between Linux and
Windows: Linux remains unchanged, while Windows writes in chunks with
MAX_CHUNK_SIZE set to 1GB.
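
A sketch of that idea (illustrative only; whether the workaround should be
Windows-only was not settled here, and SIZE_MAX as the "no limit" value is my
assumption):

#include <stdint.h>

#ifdef _WIN32
/* Cap each fwrite() to dodge the msvcrt.dll size-truncation bug. */
#define MAX_CHUNK_SIZE (1024 * 1024 * 1024)    /* 1GB */
#else
#define MAX_CHUNK_SIZE SIZE_MAX                /* effectively no chunking */
#endif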

 
> > +        while (offset < b->size) {
> > +            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
> > +                         ? MAX_CHUNK_SIZE
> > +                         : (b->size - offset);

> You can write this as
>      chunk_size = MIN(b->size - offset, MAX_CHUNK_SIZE);
> which I think is clearer. (Our osdep header provides MIN().)

> I think we should abstract out the bug workaround into a
> separate function, with the same API as fwrite(). Call
> it do_fwrite() or something, and make all the fwrite()
> calls use it. I know at the moment there's only two of
> them, and one of them is the header so never 4GB, but
> I think this more cleanly separates out the "work around
> a runtime library problem" part from the main logic of
> the program, and will mean that if we ever need to rearrange
> how we write out the data in future it will be simple.


Thanks, you are right!
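
For the record, a minimal sketch of such a wrapper, assuming QEMU's MIN() from
osdep.h and the MAX_CHUNK_SIZE above (the do_fwrite() name is Peter's
suggestion; the body below is only illustrative, not the final patch):

#include "qemu/osdep.h"   /* provides MIN() */
#include <stdio.h>

static size_t do_fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    size_t total = size * nmemb;
    size_t written = 0;

    if (size == 0 || nmemb == 0) {
        return 0;
    }

    while (written < total) {
        size_t chunk_size = MIN(total - written, MAX_CHUNK_SIZE);
        size_t r = fwrite((const char *)ptr + written, 1, chunk_size, stream);

        written += r;
        if (r != chunk_size) {
            break;  /* short write: stop and report progress */
        }
    }
    /* Like fwrite(): return the number of complete elements written. */
    return written / size;
}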



