Re: [Qemu-devel] [Consult] tilegx: About floating point instructions
From: Chen Gang
Subject: Re: [Qemu-devel] [Consult] tilegx: About floating point instructions
Date: Sat, 15 Aug 2015 17:56:07 +0800
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 8/13/15 22:59, Chen Gang wrote:
> Hello all:
>
> For single insns, I guess they are simple, and the calculation insn
> groups cannot be mixed with each other. So the current implementation
> should be OK.
>
> For double insns, I guess only the mul calculation group can be mixed
> with other calculation groups (add/sub groups or int2float/double
> groups), because of optimization -- the mul calculation group has many
> insns.
>
Oh, we are unlucky: after continuing with the gcc testsuite, I found that
the add/sub floating point insns can also be mixed together! The related
C code, -save-temps output, and objdump files are attached (is it a gcc
issue? I guess not).

So I guess we have to 'crack' all of the floating point insns precisely,
or we cannot pass the gcc testsuite.

At present, I shall first try to fix the other issues found by the gcc
testsuite, and 'crack' the floating point insns last. I guess I cannot
finish it this month (I shall try to finish it next month).

Thanks.

> So the implementation is below:
>
> /*
> * Assume floating point mul operation group can mix with other groups.
> *
> * fdouble_unpack_max: ; skipped.
> *
> * fdouble_unpack_min: ; skipped.
> *
> * fdouble_add_flags: ; move calc flags to dest.
> * save calc flags.
> * save calc addsub result.
> *
> * fdouble_sub_flags: ; move calc flags to dest.
> * save calc flags.
> * save calc addsub result.
> *
> * fdouble_addsub: ; move calc addsub result to dest.
> * set "addsub result" flag.
> *
> * fdouble_mul_flags: ; move calc mul result to dest.
> *
> * fdouble_pack1: ; if addsub result set
> * && srca == saved addsub result
> * && srcb == saved calc flags
> * move srca to dest.
> * else
> * move srcb to dest.
> *
> * fdouble_pack2: ; if srcb == r63 && "addsub result" flag
> * reset "addsub result" flag.
> * else if srcb == r63
> * pack srca dest (dest is orig srcb of pack1)
> * reference from tilegx.md: float(uns)sidf2.
> * get (u)int32_t a, then (u)int32_to_float64.
> * else
> * skipped.
> */
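
The scheme in the comment above can be modelled in plain C. This is only a minimal sketch, not the QEMU code: the Env struct, the rd/wr helpers and PLACEHOLDER_FLAGS are made-up names for illustration, host double arithmetic stands in for the softfloat calls, and the real layout of the hardware flags word is not known from this thread. The unpack insns and the int-to-double branch of pack2 are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { ZERO = 63 };                  /* r63 is the tilegx zero register */
#define PLACEHOLDER_FLAGS 0x1ull     /* made-up value, real flag layout unknown */

typedef struct {
    uint64_t r[64];                  /* general registers */
    uint64_t saved_flags;            /* "save calc flags" */
    uint64_t saved_addsub;           /* "save calc addsub result" */
    bool addsub_pending;             /* the "addsub result" flag */
} Env;

static uint64_t f64_bits(double d) { uint64_t u; memcpy(&u, &d, sizeof u); return u; }
static double bits_f64(uint64_t u) { double d; memcpy(&d, &u, sizeof d); return d; }
static uint64_t rd(const Env *e, int n) { return n == ZERO ? 0 : e->r[n]; }
static void wr(Env *e, int n, uint64_t v) { if (n != ZERO) e->r[n] = v; }

/* Do the whole addition here, remember it, and hand the flags to dest. */
static void fdouble_add_flags(Env *e, int d, int a, int b)
{
    e->saved_addsub = f64_bits(bits_f64(rd(e, a)) + bits_f64(rd(e, b)));
    e->saved_flags = PLACEHOLDER_FLAGS;
    wr(e, d, e->saved_flags);
}

/* Move the remembered add/sub result to dest and mark it as pending. */
static void fdouble_addsub(Env *e, int d, int a, int b)
{
    (void)a; (void)b;
    wr(e, d, e->saved_addsub);
    e->addsub_pending = true;
}

/* Do the whole multiplication here and put the result in dest. */
static void fdouble_mul_flags(Env *e, int d, int a, int b)
{
    wr(e, d, f64_bits(bits_f64(rd(e, a)) * bits_f64(rd(e, b))));
}

/* An add/sub group is recognised by its operands, otherwise srcb is moved. */
static void fdouble_pack1(Env *e, int d, int a, int b)
{
    bool is_addsub = e->addsub_pending
                     && rd(e, a) == e->saved_addsub
                     && rd(e, b) == e->saved_flags;
    wr(e, d, is_addsub ? rd(e, a) : rd(e, b));
}

/* Only the "reset the flag" branch is modelled, everything else is skipped. */
static void fdouble_pack2(Env *e, int d, int a, int b)
{
    (void)d; (void)a;
    if (b == ZERO && e->addsub_pending) {
        e->addsub_pending = false;
    }
}

/* The floating point insns of the 20020314-1 listing, in program order. */
static void replay_20020314(Env *e)
{
    fdouble_add_flags(e, 12, 0, 1);
    fdouble_addsub(e, 17, 11, 12);
    fdouble_mul_flags(e, 3, 2, 3);
    fdouble_pack1(e, 12, 17, 12);
    fdouble_pack1(e, 3, 13, 3);
    fdouble_pack2(e, 12, 17, ZERO);
    fdouble_pack2(e, 3, 13, 16);
}
```

With r0/r1 holding the add operands and r2/r3 the mul operands, the replay should leave the sum in r12 and the product in r3. Note that there is only one saved slot, so two interleaved add/sub groups would presumably overwrite each other's saved result.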
>
>
> On 8/11/15 21:18, Chen Gang wrote:
>>
>> Oh, it seems a little complex: in one testsuite case, a double add and
>> a double mul are mixed together! We need to save more information for
>> the correct calculation in pack1.
>>
>> It is 20020314-1.exe, the related code (I guess it is correct):
>>
>> ...
>>
>> fdouble_unpack_max r10, r3, zero
>> .LVL2:
>> fdouble_unpack_max r15, r2, zero
>> fdouble_add_flags r12, r0, r1
>> mul_hu_lu r13, r15, r10
>> mul_lu_lu r16, r15, r10
>> mula_hu_lu r13, r10, r15
>> fdouble_unpack_min r11, r0, r1
>> {
>> shli r14, r13, 32
>> fdouble_unpack_max r17, r0, r1
>> }
>> {
>> mul_hu_hu r15, r15, r10
>> add r16, r16, r14
>> }
>> {
>> shrui r13, r13, 32
>> fdouble_addsub r17, r11, r12
>> }
>> {
>> cmpltu r14, r16, r14
>> fdouble_mul_flags r3, r2, r3
>> }
>> .LVL3:
>> {
>> add r13, r15, r13
>> fdouble_pack1 r12, r17, r12
>> }
>> {
>> add r13, r13, r14
>> fdouble_unpack_max r10, r0, zero
>> }
>> fdouble_pack1 r3, r13, r3
>> fdouble_pack2 r12, r17, zero
>> fdouble_pack2 r3, r13, r16
>>
>> ...
>>
>> Welcome any additional ideas, suggestions and completions.
>>
>> Thanks.
>>
>> On 8/9/15 09:14, Chen Gang wrote:
>>> On 8/9/15 09:10, Chen Gang wrote:
>>>>
>>>> On 8/9/15 01:23, Chen Gang wrote:
>>>>> Hello all:
>>>>>
>>>>> Below is my current idea for all floating point insns. It is not a
>>>>> precise implementation, nor even a complete one -- it assumes the pack
>>>>> insns are only used for packing (u)int32_t when used individually:
>>>>>
>>>>> fsingle_add1 ; return calc flags, save calc result to env.
>>>>>
>>>>> fsingle_sub1 ; return calc flags, save calc result to env.
>>>>>
>>>>> fsingle_addsub2 ; set "has result" flag.
>>>>>
>>>>> fsingle_mul1 ; skip return value, save calc result to env.
>>>>> set "has result" flag.
>>>>>
>>>>> fsingle_mul2 ; skipped.
>>>>>
>>>>>
>>>>> fsingle_pack1 ; skipped.
>>>>>
>>>>> fsingle_pack2 ; if "has result"
>>>>> reset "has result" flag.
>>>>> return calc result from env.
>>>>> else
>>>>> pack srca
>>>>> reference from tilegx.md: float(uns)sisf2.
>>>>> get (u)int32_t a, then (u)int32_to_float32.
>>>>
>>>> For "pack srca and srcb", the related demo like below (srca and srcb
>>>> are uint64_t):
>>>>
>>>
>>> Oh, sorry, for "pack srca" (not for "pack srca and srcb")
>>>
>>>> switch (srca & 0x3ff) {
>>>>
>>>> /* treat it as uint32_t */
>>>> case 0x9e:
>>>> return uint32_to_float32(srca >> 32, &FP_STATUS);
>>>>
>>>> /* treat it as int32_t, must be negative number */
>>>> case 0x29e:
>>>> return int32_to_float32(srca >> 32 | 0x80000000, &FP_STATUS);
>>>>
>>>> default:
>>>> unimplemented (gen_exception).
>>>> }
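
The "pack srca" switch above can be tried out as a standalone function. This is a rough sketch under assumptions: fsingle_pack_int is a hypothetical name, a false return stands in for the unimplemented (gen_exception) case, and host casts replace uint32_to_float32/int32_to_float32, so rounding follows the host rounding mode rather than FP_STATUS. The tag values 0x9e and 0x29e are taken from the demo as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode the low 10 bits of srca as a tag and convert
 * the upper 32 bits to float. Returns false for any tag the demo does not
 * handle.
 */
static bool fsingle_pack_int(uint64_t srca, float *out)
{
    uint32_t hi = (uint32_t)(srca >> 32);

    switch (srca & 0x3ff) {
    case 0x9e:
        /* treat it as uint32_t */
        *out = (float)hi;
        return true;

    case 0x29e: {
        /* treat it as int32_t, must be a negative number */
        uint32_t raw = hi | 0x80000000u;
        /* two's complement value of raw, computed without a narrowing cast */
        int64_t val = (int64_t)raw - INT64_C(0x100000000);
        *out = (float)val;
        return true;
    }

    default:
        return false;
    }
}
```

For example, an srca with 7 in the upper half and tag 0x9e gives 7.0f, and an upper half of 0xFFFFFFFD with tag 0x29e gives -3.0f.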
>>>>
>>>>>
>>>>> fdouble_unpack_max: ; skipped.
>>>>>
>>>>> fdouble_unpack_min: ; skipped.
>>>>>
>>>>> fdouble_add_flags: ; return calc flags, save calc result to env.
>>>>>
>>>>> fdouble_sub_flags: ; return calc flags, save calc result to env.
>>>>>
>>>>> fdouble_addsub: ; set "has result" flag.
>>>>>
>>>>> fdouble_mul_flags: ; skip return flags, save calc result to env.
>>>>> set "has result" flag.
>>>>>
>>>>> fdouble_pack1: ; if "has result"
>>>>> reset "has result" flag.
>>>>> return calc result from env.
>>>>> else
>>>>> pack srca and srcb.
>>>>> reference from tilegx.md: float(uns)sidf2.
>>>>> get (u)int32_t a, then (u)int32_to_float64.
>>>>>
>>>>
>>>> For "pack srca and srcb", the related demo like below (srca and srcb
>>>> are uint64_t):
>>>>
>>>> switch (srcb & 0xffff) {
>>>>
>>>
>>> Oh, sorry, should use 0xfffff instead of 0xffff.
>>>
>>>> /* treat it as uint32_t */
>>>> case 0x21b00:
>>>> return uint32_to_float64(srca >> 4, &FP_STATUS);
>>>>
>>>> /* treat it as int32_t, must be negative number */
>>>> case 0xa1b00:
>>>> return int32_to_float64(srca >> 4 | 0x80000000, &FP_STATUS);
>>>>
>>>> default:
>>>> unimplemented (gen_exception).
>>>> }
>>>>
>>>>> fdouble_pack2: ; skipped.
>>>>>
>>>>>
>>>>> (fsingle_add1/sub1 and fdouble_add/sub_flags can be used individually,
>>>>> e.g. in the gcc testsuite for complex numbers).
>>>>>
>>>>>
>>>>> Next, I shall implement the floating point insns, welcome any related
>>>>> ideas, suggestions, and completions.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> On 8/5/15 22:16, Chen Gang wrote:
>>>>>> On 8/4/15 23:04, Richard Henderson wrote:
>>>>>>> On 08/04/2015 06:56 AM, Chen Gang wrote:
>>>>>>>>
>>>>>>>> On 8/4/15 04:47, Chen Gang wrote:
>>>>>>>>> On 8/4/15 00:40, Richard Henderson wrote:
>>>>>>>>>> On 08/01/2015 02:47 AM, Chen Gang wrote:
>>>>>>>>>>> I am just adding floating point instructions (e.g. fsingle_add1),
>>>>>>>>>>> but for me, I can not find any details about them (the ISA
>>>>>>>>>>> documents only give a summary description, but not details), e.g.
>>>>>>>>>>
>>>>>>>>>> The tilegx splits the four/six cycle arithmetic into multiple
>>>>>>>>>> black-box instructions. You need only really implement one of the
>>>>>>>>>> four, with the rest of them being implemented as nops or moves.
>>>>>>>>>>
>>>>>>>>>> Looking at what gcc produces gives the hints:
>>>>>>>>>>
>>>>>>>>>>   fdouble_unpack_min  min, srca, srcb
>>>>>>>>>>   fdouble_unpack_max  max, srca, srcb
>>>>>>>>>>   fdouble_add_flags   flg, srca, srcb
>>>>>>>>>>   fdouble_addsub      max, min, flg
>>>>>>>>>>   fdouble_pack1       dst, max, flg
>>>>>>>>>>   fdouble_pack2       dst, max, zero
>>>>>>>>>>
>>>>>>>>>> The unpack, addsub, and pack2 insns can be ignored, the add_flags
>>>>>>>>>> insn can perform the whole operation, the pack1 insn performs a move
>>>>>>>>>> from "flg" to "dst".
>>>>>>>>>>
>>>>>>>>>> Similarly for the single-precision:
>>>>>>>>>>
>>>>>>>>>>   fsingle_add1        tmp, srca, srcb
>>>>>>>>>>   fsingle_addsub2     tmp, srca, srcb
>>>>>>>>>>   fsingle_pack1       flg, tmp
>>>>>>>>>>   fsingle_pack2       dst, tmp, flg
>>>>>>>>>>
>>>>>>>>>> The add1 insn performs the whole operation, the addsub2 and pack1
>>>>>>>>>> insns are ignored, and the pack2 insn is a move from tmp to dst.
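
The simplified double-precision scheme described above can be sketched in C as follows. This is only an illustration under assumptions: each insn is modelled as a function returning the new value of its destination register, the helper names f64_bits/bits_f64 and emulate_adddf3 are made up, and host double addition stands in for whatever softfloat call a real implementation would use.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t f64_bits(double d) { uint64_t u; memcpy(&u, &d, sizeof u); return u; }
static double bits_f64(uint64_t u) { double d; memcpy(&d, &u, sizeof d); return d; }

/* Ignored: store zero so the integer arithmetic that uses it is dead. */
static uint64_t fdouble_unpack_min(uint64_t srca, uint64_t srcb)
{
    (void)srca; (void)srcb;
    return 0;
}

static uint64_t fdouble_unpack_max(uint64_t srca, uint64_t srcb)
{
    (void)srca; (void)srcb;
    return 0;
}

/* Performs the whole operation; the result travels in the "flags" register. */
static uint64_t fdouble_add_flags(uint64_t srca, uint64_t srcb)
{
    return f64_bits(bits_f64(srca) + bits_f64(srcb));
}

/* Ignored: destination keeps its old value. */
static uint64_t fdouble_addsub(uint64_t dest, uint64_t srca, uint64_t srcb)
{
    (void)srca; (void)srcb;
    return dest;
}

/* A move from "flg" to "dst". */
static uint64_t fdouble_pack1(uint64_t dest, uint64_t srca, uint64_t flg)
{
    (void)dest; (void)srca;
    return flg;
}

/* Ignored: destination keeps its old value. */
static uint64_t fdouble_pack2(uint64_t dest, uint64_t srca, uint64_t srcb)
{
    (void)srca; (void)srcb;
    return dest;
}

/* The six-insn sequence that gcc emits for a double add, as quoted above. */
static double emulate_adddf3(double x, double y)
{
    uint64_t srca = f64_bits(x), srcb = f64_bits(y);
    uint64_t min = fdouble_unpack_min(srca, srcb);
    uint64_t max = fdouble_unpack_max(srca, srcb);
    uint64_t flg = fdouble_add_flags(srca, srcb);
    uint64_t dst;

    max = fdouble_addsub(max, min, flg);
    dst = fdouble_pack1(0, max, flg);
    dst = fdouble_pack2(dst, max, 0);
    return bits_f64(dst);
}
```

The sketch only holds while each group runs on its own, because nothing in it records which group a given pack1 belongs to.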
>>>>>>>>>>
>>>>>>>>
>>>>>>>> After checking tilegx.md completely, I think we still need to implement
>>>>>>>> each of them precisely, or we cannot emulate all cases (e.g. muldf3).
>>>>>>>
>>>>>>> No, you can still implement all of muldf3 in fdouble_mul_flags.
>>>>>>> Again, the fdouble_pack1 copies from the flag input to the output.
>>>>>>>
>>>>>>> Yes, there is a 64-bit multiply in there, but the tcg optimizer
>>>>>>> should be able to delete all of that as unused. Especially if you have
>>>>>>> the fdouble_unpack* insns store zero into their destinations.
>>>>>>>
>>>>>>
>>>>>> For me, I am not quite sure. But I guess, what you said should be OK (at
>>>>>> least, what you said is very useful for the implementation).
>>>>>>
>>>>>>
>>>>>>> Don't get me wrong -- more accurate implementation of the actual
>>>>>>> insns would be nice, especially for debugging. But if the insns
>>>>>>> aren't accurately documented I don't see what choice we have.
>>>>>>>
>>>>>>
>>>>>> For me, I guess, we can still try to implement the details.
>>>>>>
>>>>>> - The document has all floating point instructions' summary, so we can
>>>>>> think of, or guess its implementation entirely.
>>>>>>
>>>>>> - gcc uses them all and completely, so it is our good sample and good
>>>>>> reference (but we should not assume gcc must be correct, since we
>>>>>> just use qemu for gcc testsuite).
>>>>>>
>>>>>> - Tilegx floating point format should be standard (at least, reference
>>>>>> to the standard format), so we can reference the related information
>>>>>> from google/baidu.
>>>>>>
>>>>>>
>>>>>>> On the good side, implementing the entire operation as part of the
>>>>>>> "flags" step probably results in faster emulation.
>>>>>>>
>>>>>>
>>>>>> I guess so, too.
>>>>>>
>>>>>>
>>>>>> I shall try to finish the simple implementation first, then try to
>>>>>> implement the floating point instructions in detail in the future (it
>>>>>> should be lower priority).
>>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>
>>>>
>>>
>>
>
--
Chen Gang
Open, share, and attitude like air, water, and life which God blessed
floating-point-double-add.tar.gz
Description: GNU Zip compressed data