Subject: Target native layer
From: Dr. Torsten Rupp
Date: Tue, 10 Aug 2004 13:30:09 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624
Dear Classpath developers,

I have read many emails on the list about the TARGET_* layer. I am not
very happy about this discussion, because we already discussed it about
a year ago, and at that time everybody seemed happy to get the TARGET_*
abstraction layer. I am also unhappy because there is now a discussion
about removing it completely without, imho, understanding the idea
behind the layer. After exchanging some private emails, I would like to
post some of my thoughts in public.
1. Advantages and disadvantages

Advantages of TARGET_*:
- efficient code (independent of the compiler)
- easy to port (just override the macros which are special
  cases for the target compared to the generic implementation)
- macros are usually small (only 3-5 lines)
- functions can also be used (e.g. for complex native code)
- there is no "dead code"
- no need for extensive ifdef-elif-else-endif constructs
- the target-dependent implementation is located in a single file
  (a header file with macros)

Disadvantages of TARGET_*:
- debugging is more complicated
- not type-safe

Advantages of target-layer functions like do_*():
- debugging is easier
- type-safe like other C code

Disadvantages of target-layer functions do_*():
- less efficient (an additional function call)
- "dead code" if some function in an object file is not used
  (a problem with the linker; see comments below)
- code cluttered with ifdef-elif-else-endif constructs
- running autoconf for embedded systems is difficult, so
  many "hard-wired" predefines are needed
- target-dependent implementations are located in several
  files (configure, a header file with predefines, a C file)
2. Naming and other "cosmetic" things

The naming convention for TARGET_* is as follows:

  TARGET_NATIVE_<module>_<function>

<module> stands for some group of functions, e.g. file functions.
<function> stands for the name of a function, usually the name of the
corresponding OS function from Linux extended by some suffix, e.g.
OPEN_READ, which stands for open(..., O_RDONLY). There are (or at
least there should be) no exceptions, so some names can become a
little long. But there are no "arbitrary" abbreviations which are
difficult to understand.

For the math macros the naming follows the same scheme, e.g.

  TARGET_NATIVE_MATH_FLOAT_DOUBLE_ISNAN
    module:   MATH_FLOAT (floating-point macros)
    function: DOUBLE_ISNAN (check whether a double is NaN)

The prefix TARGET_NATIVE_ is always used to avoid naming conflicts with
existing names in the OS includes. E.g. using plain OPEN_READ() would be
dangerous, because OPEN_READ could already exist as a constant, macro,
or function in the specific target OS.
OF COURSE... this naming is _not_ fixed, and I do _not_ claim it is the
best possible solution (but it is a solution; before it we had many
conflicts with different OSes because of the supposedly "convenient"
short names). If needed, this can be changed without too much confusion
and pain (it would of course mean some pain for aicas, but that would be
acceptable).

Cosmetic issues are:
- length of macro names (I usually never type a macro name;
  instead I use cut+paste; Emacs users can use auto-completion,
  and Eclipse users also get some help from the IDE). By the way:
  the longest name is currently 55 characters long.
- the prefix TARGET_NATIVE: some other prefix would do as well
- length of lines
3. Complexity of macros, debugging

It is true that #defines are difficult to debug. They are also difficult
to write, although of course it always depends on the specific macro.
Usually the macros have the following form:

  #define TARGET_NATIVE_<name>(...) \
    do { \
      FUNCTION \
      RESULT \
    } while (0)

with the required INCLUDEs listed at the top of the header file (an
#include cannot appear inside a macro body), e.g.:

  #include <sys/types.h>
  #include <sys/stat.h>
  #include <fcntl.h>

  #define TARGET_NATIVE_FILE_OPEN(filename,filedescriptor,flags,permissions,result) \
    do { \
      filedescriptor=open(filename, \
                          flags, \
                          permissions \
                         ); \
      result=(filedescriptor>=0)?TARGET_NATIVE_OK:TARGET_NATIVE_ERROR; \
    } while (0)
The standard (generic) implementation contains 138 of these macros. Only
9 are more complex, because of different possible implementations
(selected by autoconf) or transformation of values. Thus in most cases
the macros are only "wrappers" around some OS-specific call, including
adaptation of parameters (e.g. types or units, result value).

Imho the complexity of the macros is usually not very high (if a macro
becomes complex, a function can be implemented instead; the macro is
then only an "alias"). They are multi-lined to make them more readable.
The "include" statements are needed in the generic implementation, and
the "do...while" is a construct for safe usage of the macro.

Debugging of macros is difficult - if they are complex. If a macro is
only a wrapper, then only an OS-specific function is called, plus some
additional calculations, e.g. evaluation of the return value. It is a
good idea to keep the macros as simple as possible. And that is
possible, because the TARGET_* layer does not add any functionality; it
only "maps" existing functionality.
4. autoconf, POSIX - porting

autoconf is a nice tool which is also used heavily at aicas (half of my
time I am "doing autoconf"). But autoconf is also limited in its usage.
For Unix-like systems autoconf is a good solution (that is what
autoconf was written for, I assume), but for non-Unix-like systems
autoconf can become a problem. About a year ago we discussed at aicas
whether we should use autoconf only, but we found that this is not
really possible. Especially for embedded systems and "strange" systems
(e.g. MinGW or embOS) it is a big challenge to "trim" autoconf so that
the right configuration is selected. I ran into some intricacies of
autoconf and the specific OSes which make it very difficult to use
autoconf only. I will give you a few examples:
- for embedded systems it is not always feasible to check whether a
function exists by compiling and linking a small example program,
because sometimes linkage is done only partially on the host. Final
linkage is done on the target when loading the program, or when creating
the system image with the included application. Thus AC_CHECK_FUNC is
not feasible. The same problem occurs for other checks, e.g. constants
or datatypes.
- some systems have very strange and even broken header files. E.g. for
Windows/MinGW the headers sys/stat.h, io.h, windows.h, and winbase.h are
needed for checking chsize() (truncate) or mkdir(). For embOS some
header files cannot even be included, because they are broken (and
cannot be changed/fixed by aicas). These things make autoconf very
complicated to use, because when there are many possible functions which
could implement some feature, it is not clear which function autoconf
will detect on some specific system. There can even be very bad side
effects if more than one function is available (e.g. f1() and f2()) and
at some point f2() is used instead of f1() (with different behavior or
limitations) because of a change made for another target system (e.g.
you add changes for RTEMS, but they also affect, say, embOS; you will
not detect the problem until you re-test all targets after every change
to the autoconf setup). It is somewhat "non-deterministic" which
features are detected and whether they are usable.
- some features are not detectable by autoconf at all, e.g. the ordering
of parameters for functions like inb() and outb() (we had that problem),
or additional parameters (which usually only produce a warning, which is
discarded), e.g. gethostbyname_r() under Solaris.
There are many more difficulties with autoconf. Replacing a target layer
(e.g. TARGET_*) with autoconf only will imho make implementations for
non-Unix-like systems very difficult, and will only shift the so-called
"complex" C-macro implementation into equally complex autoconf-macro
implementations (imho M4 is not much better than the C preprocessor, and
it is difficult to debug).
5. Multiple code - some statistics

In the implementation we currently use at aicas we have the following
systems. The numbers below count the total number of macros (functions
and constants) which differ from the standard (generic) implementation:

  generic macros: 220
  Linux:            0
  Solaris:         11
  RTEMS:            3
  MinGW:           46
  embOS:           10 (only partially implemented)

There are between 0 (Linux) and 46 (MinGW) special-case macros. Some
have to be implemented because of different OS functions; some are
implemented for efficiency (e.g. some OSes offer a POSIX-thread
interface, but the native thread interface is usually more efficient,
and with the macro technique it can be used without any overhead). Thus
for some targets 0-20% of the macros have to be reimplemented to cover
special cases. For Unix-like systems it is usually less than 5%.
Some additional comments:

Efficient code: wrapper functions are nice, but in some cases they are
overkill, e.g. when calling a simple function like sin(). In general, C
compilers do not optimize the extra call away (imho that is one reason
why "inline" was introduced). If "inline" can be used, macros are almost
no longer needed.
autoconf: autoconf is a good idea and aicas uses it heavily, but there
are some limitations, especially for embedded systems: because autoconf
cannot run test programs on embedded target systems, some tests cannot
be done with autoconf (see above). I had to replace many autoconf test
functions with special versions usable for embedded systems. And still
there are many things which are difficult to handle, e.g. which include
files have to be included for some native function. Some target systems
make it really hard to use autoconf correctly.
Dead code: the standard GNU linker does not remove functions which are
not used (dead code). Thus if at least one function is needed from an
object file, all other functions from that object file are linked into
the application, too. There is only one automatic way to remove dead
code (-ffunction-sections), but it has other disadvantages; even the man
page does not recommend it. Thus, to remove a dead function, an
#ifdef...#endif guard around the function is needed. On the other hand,
an unused macro will not produce any dead code.
A personal view: I like "long" names, and I hate uncommon abbreviations
like "fnctn" instead of "function". I also like prefixes which indicate
where something belongs, e.g. "file_open()" instead of "open()". I
usually have no problems with long names as long as the naming is
consistent and useful. I also have no problems with lines longer than 80
characters, because my editor does not enforce some "optimal" line
length.

These are my thoughts on this topic. I hope all developers who are
interested in a target native layer will reconsider the current
discussion. And I hope we will find a solution which can satisfy
everybody.
Sincerely,
Torsten