[lmi] FOP text output for regression testing.


From: Evgeniy Tarassov
Subject: [lmi] FOP text output for regression testing.
Date: Wed, 27 Jun 2007 18:55:45 +0200

Previously, FOP text output was considered useless for regression
testing, since any minor formatting change results in major changes in
the text output from FOP.

The main reason is that FOP tries to mimic the PDF output using a fixed
grid. Problems start with font differences: the PDF can contain text
in slightly smaller fonts, but the text renderer cannot represent this
formatting. All it does is squeeze as much text as fits into the same
area, and the rest is simply truncated. Combined with word-wrapping in
the PDF output, this makes the text output not only unreadable but also
highly dependent on the formatting used in the xsl-fo files.

A possible workaround is to make the page size so large that no
word-wrapping occurs. Obviously such text output is useless for a
user, but it is extremely useful for regression testing, since it does
not depend on the xsl-fo tags used to implement a particular page
layout (whether a table, a list, or simply a paragraph with
non-breaking spaces).

The following patch
http://lmi.tt-solutions.com/codestriker/codestriker.pl?topic=4357235&action=view
adds a global parameter 'page-type' to the *.xsl files, which
controls the output page geometry. The special value 'txt' tells FOP to
generate a page of 100x100 inches, which is large enough to avoid
word-wrapping.
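
To make the intended use of the parameter concrete, here is a minimal
sketch of the two command lines involved, built in Python. It is not the
script from the patch. The executable names ("xsltproc", "fop"), the
helper name build_commands, and the file names are assumptions for
illustration; only the 'page-type' parameter and its 'txt' value come
from the patch description above.

```python
from pathlib import Path


def build_commands(stylesheet, ledger_xml, out_dir):
    """Return the (xsltproc, fop) command lines for wide-page text output.

    All paths are hypothetical; adjust the executable names to the local
    installation (for example a fop.bat wrapper).
    """
    out_dir = Path(out_dir)
    stem = Path(ledger_xml).stem
    fo_file = out_dir / f"{stem}.fo"
    txt_file = out_dir / f"{stem}.txt"
    # xsltproc: set the global stylesheet parameter page-type to 'txt'.
    xslt_cmd = [
        "xsltproc",
        "--stringparam", "page-type", "txt",
        "--output", str(fo_file),
        str(stylesheet),
        str(ledger_xml),
    ]
    # FOP: render the XSL-FO file with the plain-text renderer.
    fop_cmd = ["fop", "-fo", str(fo_file), "-txt", str(txt_file)]
    return xslt_cmd, fop_cmd
```

Each returned list can be passed to subprocess.run(cmd, check=True), so
that a failing xsltproc or FOP run stops the test early.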

I have locally bootstrapped a simple regression testing environment
which uses FOP text output as a final test for changes in *.xsl files.

So far I've been able to easily spot tiny discrepancies such as an
additional space before a dot.

What this small shell script does:
1. compares xsl-fo output (from xsltproc)

2. compares xsl-fo output with xml tags stripped off and spaces
normalized (only text is left, white-space is compacted)

3. compares text output (from FOP)
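
Step 2 is the least obvious of the three, so here is a rough Python
sketch of the idea (the actual script is a shell script and may do this
differently). The function name fo_text and the OLD/NEW samples are
made up for illustration; the samples follow the layout-only change
shown in example (b) at the end of this message.

```python
import re
import xml.etree.ElementTree as ET

FO_NS = 'xmlns:fo="http://www.w3.org/1999/XSL/Format"'


def fo_text(fo_source):
    """Return only the text of an XSL-FO document, whitespace compacted."""
    root = ET.fromstring(fo_source)
    # itertext() yields every text node in document order, without tags.
    # Joining with a space treats each tag boundary as a word boundary,
    # which is an assumption: it also separates inline elements.
    text = " ".join(root.itertext())
    # Collapse any run of spaces, tabs, and newlines into one space.
    return re.sub(r"\s+", " ", text).strip()


OLD = f"""<fo:root {FO_NS}>
<fo:block font-size="10pt">
  First paragraph
</fo:block>
<fo:block font-size="10pt">
  Second paragraph
</fo:block>
</fo:root>"""

NEW = f"""<fo:root {FO_NS}>
<fo:block font-size="10pt" padding-top="1em">
  <fo:block>First paragraph</fo:block>
  <fo:block>Second paragraph</fo:block>
</fo:block>
</fo:root>"""

print(fo_text(OLD) == fo_text(NEW))
```

Both samples reduce to the same string, so a layout-only restructuring
like this one passes the comparison, while any change in the words
themselves would not.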

There are three main types of changes to *.xsl:
a) changes in the XSLT coding
Any programming done at the XSLT stage, such as loops, conditional
constructs (if, choose), templates, variables, etc.

b) changes in the page layout and text formatting
Any space changes, position adjustments, font size changes, etc.

c) text content changes
Changes in the text content.

(See below for examples of (a), (b), and (c).)

AFAIU, from the point of view of LMI users:
a) is irrelevant, since it does not affect what the user gets from the
program.
b) is visible, but not really important, since even if a page is
broken into two pages the output is still visible. No legal
consequences could result from a bogus change of type (b).
c) is really important, because it changes the document content.

The tests I have run so far show that:
1. is sensitive to all types of changes: (a), (b), and (c).
2. is sensitive only to (a) and (c).
3. is almost always insensitive to (a) and (b), and detects only
changes of type (c).

IMHO a perfect candidate for a regression testing script would be
something that combines the outputs of tests 2 and 3. Such a test would
be fully automated.
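
As a rough illustration of what combining the two outputs could look
like, here is a small Python sketch. It assumes the tag-stripped xsl-fo
text and the FOP text output have already been read into strings for
both the reference run and the current run; the function names and diff
labels are made up for this example.

```python
import difflib


def text_diff(expected, actual, label):
    """Return unified-diff lines for two texts; empty when they match."""
    return list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile=f"expected/{label}", tofile=f"actual/{label}",
        lineterm=""))


def content_regressions(expected_fo_text, actual_fo_text,
                        expected_fop_txt, actual_fop_txt):
    """Combine tests 2 and 3: report any difference in either output."""
    report = []
    # Test 2: xsl-fo output with tags stripped and spaces normalized.
    report += text_diff(expected_fo_text, actual_fo_text, "fo-text")
    # Test 3: text output from FOP rendered on the oversized page.
    report += text_diff(expected_fop_txt, actual_fop_txt, "fop-txt")
    return report
```

An empty report means that neither test saw a change; otherwise the
report lines can be printed as-is and the run marked as failed.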

Such testing could be implemented as a shell script, as a target
in GNUmakefile (an addition to check_concinnity), or as a unit test
(e.g. xsl_fo_test.exe).

Do you want me to spend time coding it? It should not take more than a
day to prepare a script with a bunch of test ledger XML files (for
example the *.xml files (ledger XML output) Richard sent some time ago
with proprietary information stripped off).

Examples of changes:
a) Such a change alters XSLT code but it does not change XSL-FO output.
-<xsl:if test="$is_composite">
-  Composite
-</xsl:if>
-<xsl:if test="not($is_composite)">
-  Not composite
-</xsl:if>
+<xsl:choose>
+  <xsl:when test="$is_composite">
+    Composite
+  </xsl:when>
+  <xsl:otherwise>
+    Not composite
+  </xsl:otherwise>
+</xsl:choose>


b) This changes the XSL-FO tree and tags, and possibly spacing or
layout, but does not alter the text content.
-<fo:block font="sans-serif" font-size="10pt">
-  First paragraph
-</fo:block>
-<fo:block font="sans-serif" font-size="10pt">
-  Second paragraph
-</fo:block>
+<fo:block font="sans-serif" font-size="10pt" padding-top="1em">
+  <fo:block>First paragraph</fo:block>
+  <fo:block>Second paragraph</fo:block>
+</fo:block>


c) This alters output (pdf, txt) text.
<fo:block>
-  <xsl:text>Needless space control from XSLT with xsl:text tags.</xsl:text>
+  xsl:text is useless because of consequent XSL-FO processor space mangling.
</fo:block>

--
Best wishes,
Evgeniy Tarassov



