texinfo-devel

Re: Using Perl's cc


From: Eli Zaretskii
Subject: Re: Using Perl's cc
Date: Sat, 11 Jul 2015 16:10:54 +0300

> Date: Sun, 05 Jul 2015 19:20:07 +0300
> From: Eli Zaretskii <address@hidden>
> Cc: address@hidden
> 
> > Date: Sun, 5 Jul 2015 16:41:30 +0100
> > From: Gavin Smith <address@hidden>
> > Cc: address@hidden
> > 
> > On 5 July 2015 at 16:15, Eli Zaretskii <address@hidden> wrote:
> > >> Date: Sun, 5 Jul 2015 15:50:01 +0100
> > >> From: Gavin Smith <address@hidden>
> > >> Cc: address@hidden
> > >>
> > >> It would be nice to be able to fall back to using alternative
> > >> functions on other platforms as well, so I'd suggest not making it
> > >> Windows-specific.
> > >
> > > Windows provides an API for converting from UTF-8 to UTF-16, and I
> > > intended to use it.  Also, the functions that accept a wchar_t
> > > argument need to be different depending on whether wchar_t is UCS-4 or
> > > UTF-16.  I think this makes the code sufficiently different to warrant
> > > separate implementations.
> > 
> > I thought you would have to write your own iswupper etc. so using the
> > system's wchar_t doesn't gain you anything.
> 
> I probably will write some of them (iswspace, for example).  But
> coming up with my own iswupper would need huge tables, an excerpt from
> the Unicode character database, so using system APIs (and the system
> wchar_t that goes with them) is a significant gain.

Below is the patch I have come up with.  It still lacks a fully
functional wcwidth implementation.  Let me know if you think it's OK
to commit these changes in their current form, or whether I should
factor out the Windows-specific code in some way.
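
To make the wcwidth gap concrete, here is a rough sketch of what a partial
replacement could look like.  It is not part of the patch: the name
sketch_wcwidth is hypothetical, it takes a full code point rather than a
wchar_t so that it does not depend on the width of wchar_t, and the ranges
are a small, approximate subset of the East Asian wide and combining
characters rather than a complete table.

```c
/* Hypothetical partial stand-in for wcwidth.  The argument is a full
   Unicode code point.  The ranges are an incomplete, illustrative
   subset; a real implementation needs the full width tables.  */
static int
sketch_wcwidth (unsigned long cp)
{
  if (cp == 0)
    return 0;

  /* C0 and C1 control characters are not printable.  */
  if (cp < 0x20 || (cp >= 0x7F && cp < 0xA0))
    return -1;

  /* Combining Diacritical Marks take no column of their own.  */
  if (cp >= 0x0300 && cp <= 0x036F)
    return 0;

  /* A few blocks that are normally rendered two columns wide.  */
  if ((cp >= 0x1100 && cp <= 0x115F)      /* Hangul Jamo initial consonants */
      || (cp >= 0x3040 && cp <= 0x30FF)   /* Hiragana and Katakana blocks */
      || (cp >= 0x4E00 && cp <= 0x9FFF)   /* CJK Unified Ideographs */
      || (cp >= 0xAC00 && cp <= 0xD7A3)   /* Hangul Syllables */
      || (cp >= 0xFF01 && cp <= 0xFF60))  /* Fullwidth forms */
    return 2;

  return 1;
}
```

Anything outside the listed ranges falls through to a width of 1, which is
the same answer the FIXME stub in the patch gives for every non-NUL
character.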

Btw, I believe this code in xspara.c:

                      if (ptr == state.word.text + 1 || !iswspace(ptr[-2]))
                        {
                          text_append_n (&state.word, " ", 1);
                        }

is incorrect, since ptr is a pointer to a 'char' array, not to a
'wchar_t' array, so ptr[-2] is a single byte rather than a whole
character.  Am I missing something?
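
To illustrate the concern, here is a minimal sketch of how one could look at
the whole character that precedes a position in a UTF-8 'char' buffer,
instead of a single byte.  It is only an illustration, not a proposed fix
for xspara.c: the helper name prev_utf8_char and its return convention (-1
for "no character" or a malformed sequence) are assumptions made for this
example, and it does not reject overlong encodings.

```c
/* Hypothetical helper: decode the UTF-8 character that ends just
   before P, never reading before START.  Returns the code point, or
   -1 if there is no preceding character or the bytes are malformed.  */
static long
prev_utf8_char (const char *start, const char *p)
{
  const unsigned char *s = (const unsigned char *) start;
  const unsigned char *q = (const unsigned char *) p;
  long cp;
  int ncont = 0;

  if (q <= s)
    return -1;

  /* Step back over at most three continuation bytes (10xxxxxx).  */
  q--;
  while (q > s && (*q & 0xC0) == 0x80 && ncont < 3)
    {
      q--;
      ncont++;
    }

  /* Q now points at the candidate lead byte.  */
  if (*q < 0x80)
    return ncont == 0 ? (long) *q : -1;
  if ((*q & 0xE0) == 0xC0 && ncont == 1)
    cp = *q & 0x1F;
  else if ((*q & 0xF0) == 0xE0 && ncont == 2)
    cp = *q & 0x0F;
  else if ((*q & 0xF8) == 0xF0 && ncont == 3)
    cp = *q & 0x07;
  else
    return -1;

  /* Fold in the payload bits of each continuation byte.  */
  while (ncont-- > 0)
    cp = (cp << 6) | (*++q & 0x3F);

  return cp;
}
```

The returned code point could then be handed to a whitespace test that works
on code points, whereas iswspace (ptr[-2]) only ever sees one byte of a
multibyte sequence.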

> > >> There's two big hurdles to running the extension module: one is
> > >> building it, which it looks like you will manage to do; but the other
> > >> is loading it from a running Perl instance, and I still don't know if
> > >> that will succeed.
> > 
> > > What are the hurdles?  I'd expect Perl to be able to load a .dll file
> > > on Windows exactly like it loads a .so file on Unix.  The only issue
> > > might be telling Perl to look in the directory where Texinfo's "make
> > > install" puts them, but I expect texi2any to take care of that
> > > already, no?
> > 
> > If it wasn't compiled properly, e.g. with the wrong flags or header
> > files. Perl may find the .dll file and load it and things may break
> > afterwards when the code in the extension runs.
> 
> We shall see.

Progress report: With this command:

  make PERL_INC=/d/usr/Perl/lib/CORE LDFLAGS='-no-undefined -L/d/usr/Perl/lib/CORE -lperl520'

I can successfully compile the code, but linking it into a shared
library fails, see below.

Explanation of variables set on the Make command line:

  . PERL_INC needs to point to the native Perl's include directory;
    hopefully this will eventually be done by configure

  . -no-undefined -- without this flag, libtool will not build shared
    libraries on Windows, because building a shared library on Windows
    cannot leave any unresolved references

  . -lperl520 and the corresponding -L switch need to be given to
    provide the import library for building the shared library, with
    symbols that tell the linker how to find functions defined by Perl
    itself that the XS extension calls; without that library I have
    tons of undefined references

This still fails to build the shared library, with these error
messages:

  libtool: link: gcc -shared  .libs/XSParagraph_la-XSParagraph.o mylib/.libs/XSParagraph_la-xspara.o mylib/.libs/XSParagraph_la-text.o   -L/d/usr/Perl/lib/CORE -lperl520  -O2   -o .libs/XSParagraph-0.dll -Wl,--enable-auto-image-base -Xlinker --out-implib -Xlinker .libs/XSParagraph.dll.a
  .libs/XSParagraph_la-XSParagraph.o: In function `XS_XSParagraph_end_line_count':
  d:\gnu\svn\texinfo\trunk\tp\Texinfo\Convert\XSParagraph/XSParagraph.c:125: undefined reference to `_imp__Perl_pad_sv'
  .libs/XSParagraph_la-XSParagraph.o: In function `XS_XSParagraph_end_line':
  d:\gnu\svn\texinfo\trunk\tp\Texinfo\Convert\XSParagraph/XSParagraph.c:174: undefined reference to `_imp__Perl_pad_sv'
  .libs/XSParagraph_la-XSParagraph.o: In function `XS_XSParagraph_get_pending':
  d:\gnu\svn\texinfo\trunk\tp\Texinfo\Convert\XSParagraph/XSParagraph.c:200: undefined reference to `_imp__Perl_pad_sv'
  .libs/XSParagraph_la-XSParagraph.o: In function `XS_XSParagraph_set_space_protection':
  d:\gnu\svn\texinfo\trunk\tp\Texinfo\Convert\XSParagraph/XSParagraph.c:462: undefined reference to `_imp__Perl_pad_sv'
  mylib/.libs/XSParagraph_la-text.o: In function `text_printf':
  d:\gnu\svn\texinfo\trunk\tp\Texinfo\Convert\XSParagraph/mylib/text.c:34: undefined reference to `vasprintf'
  collect2.exe: error: ld returned 1 exit status
  Makefile:407: recipe for target `XSParagraph.la' failed
  make[1]: *** [XSParagraph.la] Error 1

The issue with vasprintf is clear (how do I solve it?), but
`_imp__Perl_pad_sv' is the import symbol corresponding to the
Perl_pad_sv function, which is not found in the import library
libperl520.a distributed with my Perl.  I have no idea what that means
(perhaps a bug in the version of Perl I have? or maybe this function),
and couldn't find anything on the net to help me resolve this issue.

Is there perhaps a way to avoid calling Perl_pad_sv?  They all come
from lines like this:

        dXSTARG;

What does that do?

Here're the changes I needed to compile xspara.c:

Index: tp/Texinfo/Convert/XSParagraph/mylib/xspara.c
===================================================================
--- tp/Texinfo/Convert/XSParagraph/mylib/xspara.c       (revision 6409)
+++ tp/Texinfo/Convert/XSParagraph/mylib/xspara.c       (working copy)
@@ -6,7 +6,9 @@
 #include <stdio.h>
 #include <string.h>
 #include <locale.h>
+#ifndef _WIN32
 #include <langinfo.h>
+#endif
 #include <wchar.h>
 #include <wctype.h>
 
@@ -67,8 +69,132 @@
 
 static PARAGRAPH state;
 
+#ifdef _WIN32
 
+#define WIN32_LEAN_AND_MEAN
+#include <windows.h>
+#include <errno.h>
 
+char *
+w32_setlocale (int category, const char *value)
+{
+  if (_stricmp (value, "en_us.utf-8") != 0)
+    return NULL;
+
+  /* Switch to the Windows U.S. English locale with its default
+     codeset.  We will handle the non-ASCII text ourselves, so the
+     codeset is unimportant, and Windows doesn't support UTF-8 as the
+     codeset anyway.  */
+  return setlocale (category, "ENU");
+}
+#define setlocale(c,v)  w32_setlocale(c,v)
+
+static unsigned int
+utf16_to_ucs4 (const wchar_t *wc)
+{
+  unsigned int code;
+
+  code = 0x10000;
+  code += (wc[0] & 0x03FF) << 10;
+  code += (wc[1] & 0x03FF);
+  return code;
+}
+
+size_t
+mbrlen (const char * __restrict__ mbs, size_t n, mbstate_t * __restrict__ ps)
+{
+  unsigned char byte1 = *mbs;
+
+  if (ps != NULL)
+    {
+      errno = ENOSYS;
+      return -1;
+    }
+
+  return
+    ((byte1 & 0x80) == 0) ? 1 : ((byte1 & 0x20) == 0) ? 2 :
+    ((byte1 & 0x10) == 0) ? 3 : 4;
+}
+
+/* Convert a UTF-8 encoded multibyte string to a wide character.  */
+size_t
+mbrtowc (wchar_t * __restrict__ pwc, const char * __restrict__ mbs, size_t n,
+        mbstate_t * __restrict__ ps)
+{
+  if (mbs == NULL)
+    return 0;
+  else
+    {
+      wchar_t wc[2];
+      size_t n_utf16 = MultiByteToWideChar (CP_UTF8, MB_ERR_INVALID_CHARS,
+                                           mbs, n, wc, 2);
+      unsigned char byte1 = *mbs;
+
+      if (n_utf16 == 0)
+       {
+         errno = EILSEQ;
+         return (size_t)-1;
+       }
+      if (ps != NULL && n_utf16 > 1)
+       {
+         errno = ENOSYS;
+         return (size_t)-1;
+       }
+      if (pwc != NULL)
+       *pwc = wc[0];
+
+      return mbrlen (mbs, n, ps);
+    }
+}
+
+int
+iswspace (wint_t wc)
+{
+  /* According to Unicode's Proplist.txt:
+
+    0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
+    0020          ; White_Space # Zs       SPACE
+    0085          ; White_Space # Cc       <control-0085>
+    00A0          ; White_Space # Zs       NO-BREAK SPACE
+    1680          ; White_Space # Zs       OGHAM SPACE MARK
+    2000..200A    ; White_Space # Zs  [11] EN QUAD..HAIR SPACE
+    2028          ; White_Space # Zl       LINE SEPARATOR
+    2029          ; White_Space # Zp       PARAGRAPH SEPARATOR
+    202F          ; White_Space # Zs       NARROW NO-BREAK SPACE
+    205F          ; White_Space # Zs       MEDIUM MATHEMATICAL SPACE
+    3000          ; White_Space # Zs       IDEOGRAPHIC SPACE  */
+
+  if ((wc >= 0x09 && wc <= 0x0D)
+      || wc == 0x20
+      || wc == 0x85
+      || wc == 0xA0
+      || wc == 0x1680
+      || (wc >= 0x2000 && wc <= 0x200A)
+      || wc == 0x2028
+      || wc == 0x2029
+      || wc == 0x202F
+      || wc == 0x205F
+      || wc == 0x3000)
+    return 1;
+
+  return 0;
+}
+
+/* FIXME: Provide a real implementation.  */
+int
+wcwidth (const wchar_t wc)
+{
+  return wc == 0 ? 0 : 1;
+}
+
+int
+iswupper (wint_t wc)
+{
+  return IsCharUpper ((WCHAR)wc);
+}
+
+#endif
+
 void
 xspara_hello (void)
 {
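
For reference, the two pieces of arithmetic the Windows code above relies on
(combining a UTF-16 surrogate pair into one code point, and deriving the
length of a UTF-8 sequence from its first byte) can be sketched in portable
C as follows.  This is an illustration under my own assumptions, not code
from the patch: the names surrogates_to_code_point and utf8_seq_len are
hypothetical, and unlike the patch's helpers these versions check their
input and report an error (-1 or 0) for values that are not valid.

```c
/* Hypothetical helper: combine a UTF-16 surrogate pair into a single
   code point.  Returns -1 if HI is not a high surrogate or LO is not
   a low surrogate.  */
static long
surrogates_to_code_point (unsigned int hi, unsigned int lo)
{
  if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
    return -1;

  /* Each surrogate carries 10 payload bits; the pair encodes the
     code point minus 0x10000.  */
  return 0x10000L + ((long) (hi - 0xD800) << 10) + (long) (lo - 0xDC00);
}

/* Hypothetical helper: length in bytes of the UTF-8 sequence that
   starts with LEAD, or 0 if LEAD cannot start a sequence (for
   example, a continuation byte).  */
static int
utf8_seq_len (unsigned char lead)
{
  if (lead < 0x80)
    return 1;
  if (lead >= 0xC2 && lead <= 0xDF)
    return 2;
  if ((lead & 0xF0) == 0xE0)
    return 3;
  if (lead >= 0xF0 && lead <= 0xF4)
    return 4;
  return 0;
}
```

The main difference from the mbrlen in the patch is that a continuation
byte (0x80..0xBF) is reported as invalid here instead of being counted as
the start of a 2-, 3- or 4-byte sequence.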


