avr-gcc-list
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [avr-gcc-list] split movhi to 2 movqi


From: Georg-Johann Lay
Subject: Re: [avr-gcc-list] split movhi to 2 movqi
Date: Fri, 20 May 2011 15:39:43 +0200
User-agent: Thunderbird 2.0.0.24 (X11/20100302)

Ilya Lesokhin schrieb:
> On Thu, May 19, 2011 at 6:43 PM, Georg-Johann Lay <address@hidden> wrote:
> 
>> Ilya Lesokhin schrieb:
>>
>> Hi,
>>> I would like to split a movhi instruction from an immediate to a const
>>> address to 2 movqi instructions.
>>> (I'm using gcc 4.5.3 but i dont think it matters in this case)
>>>
>> The current development version is 4.7.0.
>>
>> It won't help to provida a patch against an old version. So you need an
>> up-to-date version anyway. This includes also the testsuite. Patches must
>> pass the testsuite without regressions.
>>
>>
> 
> I'm doing it for myself at the moment, i dont think i've reached the stage i
> can contribute to the mainline branch yet, So i dont really care what the
> development branch is.
> 
> 
>> My motivation is the following:
>>> Code in the form:
>>> OCR2RA = 1;
>>>
>>> compiles to:
>>>
>>> ldi r24, 0x01 ; 1
>>> ldi r25, 0x00 ; 0
>>> sts 0x00F7, r25
>>> sts 0x00F6, r24
>>>
>>> instead of
>>>
> 
> ldi r24, 0x01 ; 1
> sts 0x00F7, __zero_reg__
> sts 0x00F6, r24
> 
> 
> 
>>  I'm pretty sure its caused by the following code:
>>> (define_expand "movhi"
>>> [(set (match_operand:HI 0 "nonimmediate_operand" "")
>>> (match_operand:HI 1 "general_operand" ""))]
>>> ""
>>> "
>>> {
>>>  /* One of the ops has to be in a register. */
>>> if (!register_operand(operand0, HImode)
>>> && !(register_operand(operand1, HImode) || const0_rtx == operands[1]))
>>> {
>>> operands[1] = copy_to_mode_reg(HImode, operand1);
>>> }
>>> }")
>>>
>>> i would like to fix it but i know very little about gcc internals.
>>>
>>> i tried to do something like:
>>>
>>> (define_expand "movhi"
>>> [(set (match_operand:HI 0 "nonimmediate_operand" "")
>>> (match_operand:HI 1 "general_operand" ""))]
>>> ""
>>> "
>>> {
>>> rtx base = XEXP (operands[0], 0);
>>> if (GET_CODE (operands[1]) == CONST_INT
>>> && CONSTANT_ADDRESS_P (base))
>>> {
>>> short value = INTVAL (operands[1]);
>>> short address = INTVAL (base);
>>>
>> base is a constant, but not necessarily a CONST_INT, it can also be a
>> SYMBOL_REF or a CONST. If you add support for storing zero-extended values,
>> I see no reason why to excluse some flavours of address.
> 
>> emit_move_insn(gen_rtx_MEM(QImode,address+1), GEN_INT(value >> 8));
>>> emit_move_insn(gen_rtx_MEM(QImode,address), GEN_INT(value & 0xff));
>>>
>> address+1 is wrong, because address in general is not a CONST_INT
>> Maybe plus_constant is what you looking for.
>>
> 
> OK, thank you for pointing those things out.
> 
>> Moreover, there is no point in expanding some stuff if there is no insn
>> that will handle it.
>>
>>
> it seems its handles correcly by movqi and other relevet patterns.

That's because splitting a volatile mem is not appropriate. You would
have to volatile QI accesses instead of one volatile HI access. This
means the two parts could move far away from each other. GCC is
conservative about volatiles, so should be a backend.

>> DONE;
>>> }
>>> else /* One of the ops has to be in a register. */
>>> if (!register_operand(operand0, HImode)
>>> && !(register_operand(operand1, HImode) || const0_rtx == operands[1]))
>>> {
>>> operands[1] = copy_to_mode_reg(HImode, operand1);
>>> }
>>> }")
>>>
>>> but then i get Segmentation fault when trying to compile code.
>>>
>> You move expansion cannot work because you need at least a QI register
>> (except in the case when you like to store 0, which is already handled in
>> the implementation.
>>
>> Presumably, you want code like this:
>>
>> (set (reg:QI n)
>>     (const_int m))
>> (set (mem:HI (const_addr:P))
>>     (zero_extend:HI (reg:QI n)))
>>
>> You can do it in expand, but note that expand is called at different stages
>> of compilation. You need an intermediate register, so you must be sure you
>> can actually get one, i.e. can_create_pseudo_p is true.
>>
>> As an alternative, you can hook in after insn combine.
>> combine will synthesize such insns, so you can supply a combiner pattern
>> that matches and split it prior to register allocation by means of a split
>> pattern.
>>
>> Yet an other alternative is to extend zero_extendqihi to accept memory
>> operand in dest. Note that there is different handlich for volatiles.
> 
> 
> i belive that spliting the instruction into 2 instractions is supperior

No, splitting a volatile access is not appropriate.

> since its also hadles the cases where the low byte is zero or when the low
> byte and the high byte are equal as suppose to zero_extend.

I don't think it's a point trading marginal efficiency against
volatile correctness.

So an implementation will look like this:

In movhi expander:

if (optimize
    && can_create_pseudo_p()
    && MEM_P (operands[0])
    && satisfies_constraint_M (operands[1])
    && const0_rtx != operands[1])
  {
    rtx val = GEN_INT (trunc_int_for_mode (INTVAL (operands[1], QImode));
    rtx reg = copy_to_mode_reg (QImode, val);
    emit_insn (gen_store_zeroextendqihi2 (operands[0], reg);
    DONE;
  }

And

(define_insn "store_zeroextendqihi2"
  [(set (match_operand:HI 0 "memory_operand" "=Qm")
        (zero_extend:HI (match_operand:QI 1 "register_operand" "r")))]
  ""
  {
     return out_movhi_mr_zerox_r (insn, operands, NULL);
  })

where out_movhi_mr_zerox_r is similar to out_movhi_mr_r but adapted
for your purposes.

Moreover, you will have to fix length computation in adjust_insn_length.

This approach is appropriate for volatile mems, too. And it does not
make restrictions to the address involved. Why exclude indirect
addressing...?

> also what do you mean by handlich for volatiles?
> 
> 
> anyway, thanks for all your help, you've been  the most helpful so far.
> 
> unfortently, the task seems alot more complicated the i first thought.

Yes, that's GCC. Don't say you didn't know before ;-)

> OCR0SA = 1u;
> is at first converted to and indirect memory acces:
> ...
> (const_int 1 [0x1])) -1 (nil))
> (set (mem/v:HI (reg/f:HI 43) [2 S2 A8])
> (reg:HI 44)) -1 (nil))
> 
> and only after the IRA phase its converted to something with a const address
> ...
> (set (mem/v:HI (const_int 210 [0xd2]) [2 S2 A8])
>   (reg:HI 24 r24 [44])) 10 {*movhi} (expr_list:REG_EQUAL (const_int 1
> [0x1] (nil)))

It's a (post)reload optimization.

> which is the form that i tried to fix. so i think ill have to try
> a different approch.

I already explained why doing it post-reload is inferior and rather a
last-resort hack.

Johann



reply via email to

[Prev in Thread] Current Thread [Next in Thread]