avr-libc-dev

Re: [avr-libc-dev] User-manual/optimization.html


From: Georg-Johann Lay
Subject: Re: [avr-libc-dev] User-manual/optimization.html
Date: Fri, 19 Jun 2015 17:13:51 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

On 06/18/2015 at 02:58 PM, David Brown wrote:
Hi,

In the user manual:

<http://www.nongnu.org/avr-libc/user-manual/optimization.html>

there is a discussion about the unexpected code generation from:

#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
unsigned int ivar;
void test2( unsigned int val )
{
        val = 65535U / val;
        cli();
        ivar = val;
        sei();
}

This came up recently in a gcc-help mailing list question - the problem
is that the call to __udivmodhi4 may be generated after the cli
instruction, disabling interrupts for longer than necessary.  The web
page says there is no way to force the desired code generation (with
"val" being calculated before "cli").

If I recall correctly, -fno-tree-ter fixed this, because the code motion is
performed by that pass.

Some technical background: The avr back-end pretends to implement integer
division and remainder by providing the respective insns, so the middle-end
assumes that a division can be performed with a few instructions.

The rationale is that avr-libgcc has many hand-written and hand-optimized
assembler routines, and many of these routines have a smaller register
footprint than the ABI requires. avr-gcc uses this information to implement
the respective features (like div) as transparent library calls, clobbering
all registers the routine destroys and placing the arguments in the expected
registers by hand.

This results in much smaller code, and many functions become leaf functions.
Without that approach, any function using a feature as basic as integer
multiplication would generate "proper" library calls, similar to calls of
ordinary functions.

If division were a library call, it would not be moved across the memory
clobber, but the code size would increase considerably.

However, there /is/ a way to get the right results - using a fake
assembly input to force the calculation:

#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
unsigned int ivar;
void test2( unsigned int val )
{
     val = 65535U / val;
     asm volatile("" :: "" (val));
     cli();
     ivar = val;
     sei();
}

The memory clobber on cli() and sei() ensures that no memory operations
are moved before or after those statements.  But as already noted, the
memory clobber does not affect non-memory operations such as
calculations or register-only manipulation.
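As a rough sketch of how the fake input could be wrapped for reuse, the snippet below hides it behind a macro. The name force_evaluation() is hypothetical and not part of avr-libc. The sketch also substitutes an "r" constraint for the empty constraint shown above, and adds a non-AVR fallback for cli()/sei() only so that it compiles on a host. It relies on the compiler keeping volatile asm statements in their original order.

```c
/* Assumption: outside avr-gcc, cli()/sei() degrade to plain compiler
   barriers so that the sketch still compiles and runs on a host. */
#if defined(__AVR__)
#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
#else
#define cli() __asm volatile( "" ::: "memory" )
#define sei() __asm volatile( "" ::: "memory" )
#endif

/* Hypothetical helper name. The asm emits no instructions, but it takes
   x as an input operand, so x has to be computed before this statement. */
#define force_evaluation(x) __asm volatile( "" :: "r" (x) )

unsigned int ivar;

void test2( unsigned int val )
{
    val = 65535U / val;
    force_evaluation( val );  /* the division result is needed here */
    cli();
    ivar = val;
    sei();
}
```

Whether the division really ends up before the cli instruction is best confirmed in the generated assembly (for example with avr-gcc -S).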

The problem is that one has to know the respective dependencies, which is
usually not the case. Just consider the case where the cli() is part of an
inlined function and the division or multiplication is performed by the
caller. Or the multiplication is part of an address computation, as in
val = list->next->next->next->val.
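To make the inlining scenario concrete, here is a minimal sketch. The names store_ivar_atomic() and update_from_caller() are hypothetical, and the non-AVR fallback for cli()/sei() is only there so the sketch compiles on a host. The point is that nothing at the call site hints that the division might need a forced dependency.

```c
/* Assumption: outside avr-gcc, cli()/sei() degrade to plain compiler
   barriers so that the sketch still compiles and runs on a host. */
#if defined(__AVR__)
#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
#else
#define cli() __asm volatile( "" ::: "memory" )
#define sei() __asm volatile( "" ::: "memory" )
#endif

unsigned int ivar;

/* Hypothetical helper: the critical section lives in an inline function,
   typically defined in a header far away from its callers. */
static inline void store_ivar_atomic( unsigned int val )
{
    cli();
    ivar = val;
    sei();
}

/* Hypothetical caller: the division is written here. After inlining, the
   compiler sees it next to cli() and may place it inside the critical
   section, although the source gives no sign of that. */
void update_from_caller( unsigned int val )
{
    store_ivar_atomic( 65535U / val );
}
```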

My recommendation is to try -fno-tree-ter before cluttering up the code with
ugly patterns.

Johann



