libtool-patches

Re: [PATCH 3/4] Use POSIX nm to simplify AIX export_symbols_cmds.


From: Peter Rosin
Subject: Re: [PATCH 3/4] Use POSIX nm to simplify AIX export_symbols_cmds.
Date: Sat, 12 Mar 2016 00:13:45 +0100
User-agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 2016-03-11 22:22, Michael Haubenwallner wrote:
> Hi Peter,
> 
> thanks for looking at the patch!
> 
> On 03/10/2016 12:29 PM, Peter Rosin wrote:
>> Hi Michael,
>>
>> I had a look since I wrote a patch for POSIX nm a couple of years ago
>> that I never submitted (I didn't see any use case) which looked very
>> similar, excepting the AIX-ism in your version.
>>
>> On 2016-03-10 10:01, Michael Haubenwallner wrote:
>>> * m4/libtool.m4 (LT_PATH_NM): Detect POSIX-compatible nm for AIX.  In
>>> BSD mode, the AIX nm does not tell whether a symbol is weak, need to use
>>> POSIX mode instead.
>>> (_LT_CMD_GLOBAL_SYMBOLS): Support POSIX-compatible nm.  Reorder to allow
>>> for platform specific hooks during transformation of global_symbol_pipe
>>> into C source code.  For AIX, set hook to transform even weak text
>>> symbols as text symbols.
>>> (_LT_LINKER_SHLIBS): Use global_symbol_pipe to simplify forming the
>>> export_symbols_cmds for AIX.
>>> ---
>>>  m4/libtool.m4 | 101 ++++++++++++++++++++++++++++++++--------------------------
>>>  1 file changed, 55 insertions(+), 46 deletions(-)
>>>
>>> diff --git a/m4/libtool.m4 b/m4/libtool.m4
>>> index 2c0e657..6134522 100644
>>> --- a/m4/libtool.m4
>>> +++ b/m4/libtool.m4
>>> @@ -3755,10 +3755,10 @@ _LT_DECL([], [want_nocaseglob], [1],
>>>  
>>>  # LT_PATH_NM
>>>  # ----------
>>> -# find the pathname to a BSD- or MS-compatible name lister
>>> +# find the pathname to a BSD-, POSIX- or MS-compatible name lister
>>>  AC_DEFUN([LT_PATH_NM],
>>>  [AC_REQUIRE([AC_PROG_CC])dnl
>>> -AC_CACHE_CHECK([for BSD- or MS-compatible name lister (nm)], lt_cv_path_NM,
>>> +AC_CACHE_CHECK([for BSD-, POSIX- or MS-compatible name lister (nm)], lt_cv_path_NM,
>>>  [if test -n "$NM"; then
>>>    # Let the user override the test.
>>>    lt_cv_path_NM=$NM
>>> @@ -3808,6 +3808,26 @@ else
>>>    : ${lt_cv_path_NM=no}
>>>  fi])
>>>  if test no != "$lt_cv_path_NM"; then
>>> +  case $host_os in
>>> +  aix[[4-9]]*)
>>> +    # With AIX nm we need the '-l' flag to get the "weak" information
>>> +    # for the Import File, but '-l' is ignored with the '-B' flag.  So
>>> +    # we use the '-P' (POSIX) flag instead.  As users often provide the
>>> +    # '-B' flag, which conflicts with '-P', we drop any provided flag.
>>> +    # AIX nm needs the '-C' flag to disable demangling.  For both GNU
>>> +    # and AIX nm, the '-g' flag shows public (global) symbols only,
>>> +    # and the '-p' flag disables sorting to improve performance.
>>> +    set dummy $lt_cv_path_NM
>>> +    case address@hidden|@2 -V 2>&1` in
>>> +    *GNU* | *'with BFD'*)
>>> +      lt_cv_path_NM="@S|@2 -Bgp"
>>> +      ;;
>>> +    *)
>>> +      lt_cv_path_NM="@S|@2 -PlCgp"
>>> +      ;;
>>> +    esac
>>> +    ;;
>>> +  esac
>>
>> You are overriding the user provided $NM. Not good. If a user says
>> NM="nm --this-will-not-work", then you will have to trust that even if
>> it is not likely to work. User error, so what? Adding -Bgp or -PlCgp
>> can only be done when the user has not specified $NM.
> 
> Agreed. I've added a check whether NM will mark weak symbols instead.

I was thinking that you need to try various flags for each nm in the
mentioned loop until you find a good nm/flags combo, and keep looking if you
think you might find an even better combo later (i.e. what is there today,
where a BSD nm is preferred over other name listers, but tweaked to suit
AIX, which seemingly prefers POSIX nm above all else).

Then, when you have found an nm/flags combo (or if the user has provided
one), and this part was already ok in the patch, you make libtool detect
whether the $NM interface is POSIX, BSD, MS dumpbin or ..., and build the
symbol pipe accordingly.
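
A rough sketch of that search order, not the actual lt_tmp_nm loop. The
names nm_marks_weak, pick_nm_combo and choose_nm are hypothetical, and the
probe is stubbed so the snippet runs anywhere; a real probe would have to
run the candidate on an object file that contains a weak symbol.

```shell
# Hypothetical probe: succeed if this nm command line would mark weak
# symbols.  Stubbed here: it only looks for a -P style flag group.
nm_marks_weak() {
  case $1 in
    *" -P"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try each candidate nm with each flag set, in order of preference,
# and print the first combination that passes the probe.
pick_nm_combo() {
  for nm in "$@"; do
    for flags in "-PlCgp" "-Bgp" ""; do
      combo="$nm${flags:+ $flags}"
      if nm_marks_weak "$combo"; then
        printf '%s\n' "$combo"
        return 0
      fi
    done
  done
  return 1
}

# A user-provided value (first argument) is trusted as is, with no
# flags added; only otherwise are the candidates searched.
choose_nm() {
  user_nm=$1
  shift
  if test -n "$user_nm"; then
    printf '%s\n' "$user_nm"
  else
    pick_nm_combo "$@"
  fi
}

choose_nm '' nm
choose_nm 'nm --this-will-not-work' nm
```

The point of choose_nm is only that flags get added inside the search,
never on top of what the user passed in.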

>> Yes, I see that
>> AIX has previously added nm flags behind the back of the user, but there
>> is no reason to continue with that now that you are changing things.
>>
>> You need to modify innards of the lt_tmp_nm loop in the else branch
>> a few lines up (just above the context).
>>
>>>    NM=$lt_cv_path_NM
>>>  else
>>>    # Didn't find any BSD compatible name lister, look for dumpbin.
>>> @@ -3832,7 +3852,7 @@ fi
>>>  test -z "$NM" && NM=nm
>>>  _LT_SET_TOOL_ABI_FLAG([NM])
>>>  AC_SUBST([NM])
>>> -_LT_DECL([], [NM], [1], [A BSD- or MS-compatible name lister])dnl
>>> +_LT_DECL([], [NM], [1], [A BSD-, POSIX- or MS-compatible name lister])dnl
>>>  
>>>  AC_CACHE_CHECK([the name lister ($NM) interface], [lt_cv_nm_interface],
>>>    [lt_cv_nm_interface="BSD nm"
>>> @@ -3847,6 +3867,8 @@ AC_CACHE_CHECK([the name lister ($NM) interface], [lt_cv_nm_interface],
>>>    cat conftest.out >&AS_MESSAGE_LOG_FD
>>>    if $GREP 'External.*some_variable' conftest.out > /dev/null; then
>>>      lt_cv_nm_interface="MS dumpbin"
>>> +  elif $GREP '^[[   ]]*_*some_variable' conftest.out > /dev/null; then
>>> +    lt_cv_nm_interface="POSIX nm"
>>
>> Isn't this a pretty weak check, perhaps append ' B' and remove the
>> possibility for leading whitespace? (see my last comment below for
>> reasoning on spaces)
> 
> As long as the expected symbol name comes first, isn't it POSIX then?
> Anyway, I've added "[\t ][\t ]*[A-Za-z]" now, as $symcode is defined later.
> And there is no check for BSD style after all.

Since it is POSIX output, my point is that it should be fairly safe to assume
B as the symbol type, maybe it could be a D if the tools do not put zero-vars
in bss, but why wouldn't they? So, perhaps [BD] is a more palatable pattern?
I simply don't think you need to match every possible symbol type. Do you?
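
To illustrate the narrower check, here is a sketch that assumes a
POSIX-format line starts with the symbol name, followed by blanks and the
type letter. classify_nm_line is a made-up helper, not something in
libtool.m4, and the sample lines are invented.

```shell
# Classify one line of name-lister output for the probe symbol
# some_variable, falling back to "BSD nm" as the existing check does.
classify_nm_line() {
  if printf '%s\n' "$1" | grep 'External.*some_variable' >/dev/null; then
    echo "MS dumpbin"
  elif printf '%s\n' "$1" |
       grep '^_*some_variable[[:space:]][[:space:]]*[BD]' >/dev/null; then
    # Name first, then the type: only B (bss) or D (data) is accepted.
    echo "POSIX nm"
  else
    echo "BSD nm"
  fi
}

classify_nm_line 'some_variable B 0 4'
classify_nm_line '0000000000000000 B some_variable'
```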

>>
>>>    fi
>>>    rm -f conftest*])
>>>  ])# LT_PATH_NM
>>> @@ -4012,8 +4034,33 @@ symcode='[[BCDEGRST]]'
>>>  # Regexp to match symbols that can be accessed directly from C.
>>>  sympat='\([[_A-Za-z]][[_A-Za-z0-9]]*\)'
>>>  
>>> +if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>> +  # Gets list of data symbols to import.
>>> +  lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>> +  # Adjust the below global symbol transforms to fixup imported variables.
>>> +  lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char \1;/p'"
>>> +  lt_c_name_hook=" -e 's/^I .* \(.*\)$/  {\"\1\", (void *) 0},/p'"
>>> +  lt_c_name_lib_hook="\
>>> +  -e 's/^I .* \(lib.*\)$/  {\"\1\", (void *) 0},/p'\
>>> +  -e 's/^I .* \(.*\)$/  {\"lib\1\", (void *) 0},/p'"
>>> +else
>>> +  # Disable hooks by default.
>>> +  lt_cv_sys_global_symbol_to_import=
>>> +  lt_cdecl_hook=
>>> +  lt_c_name_hook=
>>> +  lt_c_name_lib_hook=
>>> +fi
>>> +
>>>  # Define system-specific variables.
>>>  case $host_os in
>>> +aix[[4-9]]*)
>>> +  case `$NM -V 2>&1` in
>>> +  *GNU* | *'with BFD'*) ;;
>>> +  *)
>>> +    symcode='[[BDLTVWZ]]'
>>> +    lt_cdecl_hook=" -e 's/^W/T/p'" # weak text symbol
>>> +  esac
>>> +  ;;
>>
>> Why does AIX need to export weak symbols, when W symbols are not
>> handled in the nm output on other systems? This seems inconsistent?
> 
> Erm, with GNU nm, $symcode actually does contain W. And a weak symbol
> is referenced as variable in the lt_*_LTX_preloaded_symbols array,
> even if it might actually be a text symbol... What do I miss here?

It is probably me missing something, like looking at the default symcodes
instead of the GNU nm symcodes. My bad.

> Why there is need for the weakness information: The aix-soname=svr4
> feature uses Import Files to provide filename-based shared library
> versioning, so a subsequent linker does actually link against a text
> file rather than some binary shared object. And the Import File allows
> to specify the weak keyword, while it is ignored in an Export File.
> So the content of the Export File used to create a shared library is
> provided as the Import File needed to link against that shared library.

Ok, I clearly don't know this area, I was just asking because I thought
I saw an inconsistency. I guess it is ok on other systems, and if not I
guess that is not really your responsibility. Sorry for the noise...

>>
>>>  aix*)
>>>    symcode='[[BCDT]]'
>>>    ;;
>>> @@ -4054,23 +4101,6 @@ case `$NM -V 2>&1` in
>>>    symcode='[[ABCDGIRSTW]]' ;;
>>>  esac
>>>  
>>> -if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>> -  # Gets list of data symbols to import.
>>> -  lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>> -  # Adjust the below global symbol transforms to fixup imported variables.
>>> -  lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char \1;/p'"
>>> -  lt_c_name_hook=" -e 's/^I .* \(.*\)$/  {\"\1\", (void *) 0},/p'"
>>> -  lt_c_name_lib_hook="\
>>> -  -e 's/^I .* \(lib.*\)$/  {\"\1\", (void *) 0},/p'\
>>> -  -e 's/^I .* \(.*\)$/  {\"lib\1\", (void *) 0},/p'"
>>> -else
>>> -  # Disable hooks by default.
>>> -  lt_cv_sys_global_symbol_to_import=
>>> -  lt_cdecl_hook=
>>> -  lt_c_name_hook=
>>> -  lt_c_name_lib_hook=
>>> -fi
>>> -
>>>  # Transform an extracted symbol line into a proper C declaration.
>>>  # Some systems (esp. on ia64) link data and code symbols differently,
>>>  # so use this general approach.
>>> @@ -4128,6 +4158,9 @@ for ac_symprfx in "" "_"; do
>>>  "     s[1]~/address@hidden/{print f,s[1],s[1]; next};"\
>>>  "     s[1]~prfx {split(s[1],t,\"@\"); print 
>>> f,t[1],substr(t[1],length(prfx))}"\
>>>  "     ' prfx=^$ac_symprfx]"
>>> +  elif test "$lt_cv_nm_interface" = "POSIX nm"; then
>>> +    symxfrm="\\2 $ac_symprfx\\1 \\1"
>>> +    lt_cv_sys_global_symbol_pipe="sed -n -e 's/^[[  
>>> ]]*$ac_symprfx$sympat[[         ]][[    ]]*\($symcode$symcode*\)[[      
>>> ]][[    ]]*.*$opt_cr$/$symxfrm/p'"
>>
>> Do you really need to handle leading and multiple whitespace here?
>> Posix, at least as seen here
>>   http://pubs.opengroup.org/onlinepubs/009696699/utilities/nm.html
>> seems quite clear on no leading space and one space only as separator.
> 
> Must admit that I haven't looked at the specs - and except for leading
> ones, AIX nm does write multiple whitespaces between the fields.

Eric cleared that up, I was wrong. Sorry for the noise.
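
For reference, a simplified sketch of a POSIX-mode pipe that tolerates
leading blanks and runs of blanks between fields, as described above for
AIX nm. It borrows the AIX symcode set from the patch but leaves out the
$ac_symprfx and $opt_cr handling; posix_symbol_pipe is a hypothetical name
and the input lines are invented.

```shell
# Turn "name  type  value size" lines into "type name name",
# keeping only the symbol types listed in the bracket expression.
posix_symbol_pipe() {
  sed -n -e 's/^[[:space:]]*\([_A-Za-z][_A-Za-z0-9]*\)[[:space:]][[:space:]]*\([BDLTVWZ][BDLTVWZ]*\)[[:space:]][[:space:]]*.*$/\2 \1 \1/p'
}

# The undefined symbol (type U) is not in the set, so it is dropped.
printf '%s\n' 'foo   T   0 4' '  bar W 10 8' 'printf U' | posix_symbol_pipe
```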

>>>    else
>>>      lt_cv_sys_global_symbol_pipe="sed -n -e 's/^.*[[        
>>> ]]\($symcode$symcode*\)[[       ]][[    
>>> ]]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'"
>>>    fi
>>> @@ -5009,19 +5042,7 @@ m4_if([$1], [CXX], [
>>>    _LT_TAGVAR(exclude_expsyms, 
>>> $1)=['_GLOBAL_OFFSET_TABLE_|_GLOBAL__F[ID]_.*']
>>>    case $host_os in
>>>    aix[[4-9]]*)export_symbols_cmds
>>> -    # If we're using GNU nm, then we don't want the "-C" option.
>>> -    # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>> -    # Without the "-l" option, or with the "-B" option, AIX nm treats
>>> -    # weak defined symbols like other global defined symbols, whereas
>>> -    # GNU nm marks them as "W".
>>> -    # While the 'weak' keyword is ignored in the Export File, we need
>>> -    # it in the Import File for the 'aix-soname' feature, so we have
>>> -    # to replace the "-B" option with "-P" for AIX nm.
>>> -    if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>> -      _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs $convenience 
>>> | awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 
>>> == "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") { print \$ 3 " 
>>> weak" } else { print \$ 3 } } }'\'' | sort -u > $export_symbols'
>>> -    else
>>> -      _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e 
>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if 
>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 2 
>>> == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != ".")) 
>>> { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 " weak" 
>>> } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>> -    fi
>>> +    _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience | 
>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = "" 
>>> } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u > 
>>> $export_symbols'
> 
> On a side note:
> As the C++ value is identical to the C one for various platforms,
> wouldn't it work for them to do something like:
>   _LT_TAGVAR(export_symbols_cmds, $1)=$_LT_TAGVAR(export_symbols_cmds)
>   _LT_TAGVAR(exclude_expsyms, $1)=$_LT_TAGVAR(exclude_expsyms)

Would you not need to change tag between the get and the set for that
to work as I think you intend? What am I missing?

>>>      ;;
>>>    pw32*)
>>>      _LT_TAGVAR(export_symbols_cmds, $1)=$ltdll_cmds
>>> @@ -5464,19 +5485,7 @@ _LT_EOF
>>>     exp_sym_flag='-Bexport'
>>>     no_entry_flag=
>>>        else
>>> -   # If we're using GNU nm, then we don't want the "-C" option.
>>> -   # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>> -   # Without the "-l" option, or with the "-B" option, AIX nm treats
>>> -   # weak defined symbols like other global defined symbols, whereas
>>> -   # GNU nm marks them as "W".
>>> -   # While the 'weak' keyword is ignored in the Export File, we need
>>> -   # it in the Import File for the 'aix-soname' feature, so we have
>>> -   # to replace the "-B" option with "-P" for AIX nm.
>>> -   if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>> -     _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs $convenience | 
>>> awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == 
>>> "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") { print \$ 3 " 
>>> weak" } else { print \$ 3 } } }'\'' | sort -u > $export_symbols'
>>> -   else
>>> -     _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e 
>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if 
>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 2 
>>> == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != ".")) 
>>> { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 " weak" 
>>> } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>> -   fi
>>> +   _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience | 
>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = "" 
>>> } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u > 
>>> $export_symbols'
> 
> The main motivation here is this simplification after all,
> as this needs another symbol exclusion (patch 4/4), which
> does make sense for the preloaded symbols list as well.

Yes, it is much cleaner to adjust the symbol pipe according to $NM than
to try to "fix" $NM by adding flags. That part is nice indeed!
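
As a small illustration of the last stage, the awk step from the patch can
be run on made-up symbol pipe output. The sample symbols and the helper
names sample_pipe_output and to_export_list are assumptions; the $EGREP
exclusion step is left out, and LC_ALL=C is only there to make the sort
order predictable.

```shell
# Sample lines in the "<type> <prefixed name> <C name>" shape that the
# POSIX-mode symbol pipe in the patch emits.
sample_pipe_output() {
  printf '%s\n' 'T foo foo' 'W bar bar' 'D baz baz' 'T foo foo'
}

# Append the "weak" keyword for V, W and Z symbols, then deduplicate.
to_export_list() {
  awk '{ kw = "" } /^[VWZ] / { kw = " weak" } { print $3 kw }' |
    LC_ALL=C sort -u
}

sample_pipe_output | to_export_list
```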

Cheers,
Peter


