avr-gcc-list
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [avr-gcc-list] split movhi to 2 movqi


From: Georg-Johann Lay
Subject: Re: [avr-gcc-list] split movhi to 2 movqi
Date: Fri, 20 May 2011 19:56:32 +0200
User-agent: Thunderbird 2.0.0.24 (X11/20100302)

Ilya Lesokhin schrieb:

>> That's because splitting a volatile mem is not appropriate. You would
>> have to volatile QI accesses instead of one volatile HI access. This
>> means the two parts could move far away from each other. GCC is
>> conservative about volatiles, so should be a backend.
>>
> 
> i haven't thought about that, thank you.
> i just noticed that writing:
> 
> OCR2RAH= 0;
> OCR2RAL=1;
> 
> generates better code, and i thought its better the compiler handles it,
> rather then making the code uglier.
> but as i now see it doesn't guarantee the right behavior with respect to
> volatile.
> 
> 
>>>> DONE;
>>>>> }
>>>>> else /* One of the ops has to be in a register. */
>>>>> if (!register_operand(operand0, HImode)
>>>>> && !(register_operand(operand1, HImode) || const0_rtx == operands[1]))
>>>>> {
>>>>> operands[1] = copy_to_mode_reg(HImode, operand1);
>>>>> }
>>>>> }")
>>>>>
>>>>> but then i get Segmentation fault when trying to compile code.
>>>>>
>>>> You move expansion cannot work because you need at least a QI register
>>>> (except in the case when you like to store 0, which is already handled
>> in
>>>> the implementation.
>>>>
>>>> Presumably, you want code like this:
>>>>
>>>> (set (reg:QI n)
>>>>     (const_int m))
>>>> (set (mem:HI (const_addr:P))
>>>>     (zero_extend:HI (reg:QI n)))
>>>>
>>>> You can do it in expand, but note that expand is called at different
>> stages
>>>> of compilation. You need an intermediate register, so you must be sure
>> you
>>>> can actually get one, i.e. can_create_pseudo_p is true.
>>>>
>>>> As an alternative, you can hook in after insn combine.
>>>> combine will synthesize such insns, so you can supply a combiner pattern
>>>> that matches and split it prior to register allocation by means of a
>> split
>>>> pattern.
>>>>
>>>> Yet an other alternative is to extend zero_extendqihi to accept memory
>>>> operand in dest. Note that there is different handlich for volatiles.
>>>
>>> i belive that spliting the instruction into 2 instractions is supperior
>> No, splitting a volatile access is not appropriate.
>>
>>> since its also hadles the cases where the low byte is zero or when the
>> low
>>> byte and the high byte are equal as suppose to zero_extend.
>> I don't think it's a point trading marginal efficiency against
>> volatile correctness.
>>
>> So an implementation will look like this:
>>
>> In movhi expander:
>>
>> if (optimize
>>    && can_create_pseudo_p()
>>    && MEM_P (operands[0])
>>    && satisfies_constraint_M (operands[1])
>>    && const0_rtx != operands[1])
>>  {
>>    rtx val = GEN_INT (trunc_int_for_mode (INTVAL (operands[1], QImode));
>>    rtx reg = copy_to_mode_reg (QImode, val);
>>    emit_insn (gen_store_zeroextendqihi2 (operands[0], reg);
>>    DONE;
>>  }
>>
>> And
>>
>> (define_insn "store_zeroextendqihi2"
>>  [(set (match_operand:HI 0 "memory_operand" "=Qm")
>>        (zero_extend:HI (match_operand:QI 1 "register_operand" "r")))]
>>  ""
>>  {
>>     return out_movhi_mr_zerox_r (insn, operands, NULL);
>>  })
>>
>> where out_movhi_mr_zerox_r is similar to out_movhi_mr_r but adapted
>> for your purposes.
>>
>> Moreover, you will have to fix length computation in adjust_insn_length.
>>
>> This approach is appropriate for volatile mems, too. And it does not
>> make restrictions to the address involved. Why exclude indirect
>> addressing...?
>>
> i wanted to exclude indirect addressing since splitting in that case may
> generate worse code because then the compiler wont be able to use
> post-increment/pre-decrement, but as you've kindly explained splitting is
> not the right solution :(
> 
> 
>>> also what do you mean by handlich for volatiles?
>>>
>>>
>>> anyway, thanks for all your help, you've been  the most helpful so far.
>>>
>>> unfortently, the task seems alot more complicated the i first thought.
>> Yes, that's GCC. Don't say you didn't know before ;-)
>>
>>> OCR0SA = 1u;
>>> is at first converted to and indirect memory acces:
>>> ...
>>> (const_int 1 [0x1])) -1 (nil))
>>> (set (mem/v:HI (reg/f:HI 43) [2 S2 A8])
>>> (reg:HI 44)) -1 (nil))
>>>
>>> and only after the IRA phase its converted to something with a const
>> address
>>> ...
>>> (set (mem/v:HI (const_int 210 [0xd2]) [2 S2 A8])
>>>   (reg:HI 24 r24 [44])) 10 {*movhi} (expr_list:REG_EQUAL (const_int 1
>>> [0x1] (nil)))
>> It's a (post)reload optimization.
>>
> 
> when i used -fdump-rtl-all i got the impression form the numbering of the
> files that the IRA phase is before reload, did i get the convention wrong?

There is no dump after IRA and prior to reload. With -da, the next
dump after .ira is .postreload. Note that some passes are intertwined,
so pass numbers might be confusing.

> i still want to handle the case where the low byte is zero, so i'm thinking
> its better to create a pattern that receives a HI memory operand and 2 QI
> reg operands, and the movhi expander will just use this pattern sending
> zero_reg or  copy_to_mode_reg(...) according to the value.
> 
> do you see a problem with this solution?

No. You can use unspec or make explicit what's going on. Note there is
already zero_reg_rtx in avr.c.

Johann



reply via email to

[Prev in Thread] Current Thread [Next in Thread]