Transform XML to HTML in XSLT with string length condition
I have an XML file built with TEI, like this:
<div type="chapter" n="1">
  <p>
    <s xml:id="e_1">sentence e1.</s>
    <s xml:id="f_1">sentence f1</s>
  </p>
  <p>
    <s xml:id="e_2"> sentence e2</s>
    <s xml:id="f_2"> sentence f2</s>
  </p>
</div>
<div type="chapter" n="2">
  <!-- -->
</div>
I need to transform it into this HTML structure:
<div>
  <h1>chapter 1</h1>
  <div class="book-content">
    <p>
      <span class='source-language-sent' data-source-id='1'>sentence e1.</span>
      <span id='1' style='display:none'>sentence f1</span>
    </p>
    <p>
      <span class='source-language-sent' data-source-id='2'>sentence e2</span>
      <span id='2' style='display:none'>sentence f2</span>
    </p>
  </div>
</div>
<div>
  <h1>chapter 2</h1>
  <div class="book-content">
    <!-- -->
  </div>
</div>
For this I use the following XSLT file:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" version="1.0">
  <xsl:output method="html" encoding="utf-8" indent="yes" />

  <xsl:template match="tei:body">
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template match="tei:teiHeader">
    <xsl:comment>
      <xsl:apply-templates select="node()" />
    </xsl:comment>
  </xsl:template>

  <!-- create chapter -->
  <xsl:template match="tei:div">
    <xsl:element name="div">
      <xsl:element name="div">
        <xsl:attribute name="class">
          <xsl:text>book-content</xsl:text>
        </xsl:attribute>
        <xsl:element name="h1">
          <xsl:text>chapter</xsl:text>
          <xsl:value-of select="@n" />
        </xsl:element>
        <xsl:apply-templates select="node()" />
      </xsl:element>
    </xsl:element>
  </xsl:template>

  <!-- create p -->
  <xsl:template match="tei:p">
    <xsl:element name="p">
      <xsl:apply-templates />
    </xsl:element>
  </xsl:template>

  <!-- create s -->
  <xsl:template match="tei:s">
    <xsl:variable name="xmlid" select="@xml:id" />
    <xsl:if test="starts-with($xmlid, 'e')">
      <xsl:element name="span">
        <xsl:attribute name="class">
          <xsl:text>source-language-sent</xsl:text>
        </xsl:attribute>
        <xsl:attribute name="data-source-id">
          <xsl:value-of select="substring($xmlid, 3, 4)" />
        </xsl:attribute>
        <xsl:apply-templates select="node()" />
      </xsl:element>
    </xsl:if>
    <xsl:if test="starts-with($xmlid, 'f')">
      <xsl:element name="span">
        <xsl:attribute name="style">
          <xsl:text>display:none</xsl:text>
        </xsl:attribute>
        <xsl:attribute name="id">
          <xsl:value-of select="substring($xmlid, 3, 4)" />
        </xsl:attribute>
        <xsl:apply-templates select="node()" />
      </xsl:element>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
My problem: I need to create a new <div class="book-content"> for every 900 characters. I don't want to cut s elements, so I need to calculate how many s elements to include in one <div class="book-content"> to get something close to 900 characters.
This is an interesting problem, but your example has a lot of other things going on. I prefer to solve this in isolation, using my own example.
Consider the following input:
XML
<book>
  <chapter id="a">
    <para>
      <sentence id="1" length="23">Mary had a little lamb,</sentence>
      <sentence id="2" length="29">His fleece was white as snow,</sentence>
      <sentence id="3" length="30">And everywhere that Mary went,</sentence>
    </para>
    <para>
      <sentence id="4" length="24">The lamb was sure to go.</sentence>
      <sentence id="5" length="34">He followed her to school one day,</sentence>
    </para>
    <para>
      <sentence id="6" length="27">Which was against the rule,</sentence>
      <sentence id="7" length="35">It made the children laugh and play</sentence>
      <sentence id="8" length="24">To see a lamb at school.</sentence>
    </para>
    <para>
      <sentence id="9" length="34">And so the teacher turned it out, </sentence>
      <sentence id="10" length="27">But still it lingered near.</sentence>
    </para>
  </chapter>
  <chapter id="b">
    <para>
      <sentence id="11" length="35">Summertime, and the livin' is easy.</sentence>
      <sentence id="12" length="40">Fish are jumpin' and the cotton is high.</sentence>
      <sentence id="13" length="52">Oh, your daddy's rich and your mamma's good lookin'.</sentence>
      <sentence id="14" length="35">So hush little baby, don't you cry.</sentence>
      <sentence id="15" length="54">One of these mornings you're going to rise up singing.</sentence>
    </para>
    <para>
      <sentence id="16" length="57">Then you'll spread your wings and you'll take to the sky.</sentence>
      <sentence id="17" length="35">So hush little baby, don't you cry.</sentence>
    </para>
  </chapter>
</book>
Note: the length values are given for illustration only; they are not used in the solution.
Our task is to split each chapter whose total length exceeds 200 characters into several chapters, moving whole sentences only, while preserving the original para boundaries between groups of sentences.
XSLT 1.0
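To make the task concrete, here is a minimal Python sketch of the same grouping rule. It is not part of the solution itself: the function name `split_chapter` and the `(id, length)` tuples are assumptions made for illustration, and the numbers are the illustrative length values of chapter a.

```python
def split_chapter(paras, limit=200):
    """Greedily pack whole sentences into chunks of at most `limit` characters.

    `paras` is a list of paragraphs, each a list of (sentence_id, length) tuples.
    Returns a list of chunks; each chunk is a list of paragraphs (lists of ids),
    so paragraph boundaries survive even when a paragraph is split across chunks.
    """
    chunks = []    # finished chunks
    current = []   # paragraphs of the chunk being built
    total = 0      # character count of the chunk being built
    for para in paras:
        current_para = []
        for sentence_id, length in para:
            # Start a new chunk when the next sentence would exceed the limit,
            # unless the chunk is still empty (an oversized sentence stays whole).
            if total + length > limit and total > 0:
                if current_para:
                    current.append(current_para)
                chunks.append(current)
                current, current_para, total = [], [], 0
            current_para.append(sentence_id)
            total += length
        if current_para:
            current.append(current_para)
    if current:
        chunks.append(current)
    return chunks


# Illustrative lengths of chapter "a" (hypothetical data layout).
chapter_a = [
    [("1", 23), ("2", 29), ("3", 30)],
    [("4", 24), ("5", 34)],
    [("6", 27), ("7", 35), ("8", 24)],
    [("9", 34), ("10", 27)],
]
for chunk in split_chapter(chapter_a):
    print(chunk)
```

With these lengths the third paragraph is divided between the two chunks: sentence 6 closes the first chunk at 167 characters, and sentences 7 and 8 open the second one.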
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:set="http://exslt.org/sets"
    extension-element-prefixes="exsl set">
  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="chapter">
    <xsl:call-template name="split-chapter">
      <xsl:with-param name="nodes" select="para/sentence"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="split-chapter">
    <xsl:param name="nodes"/>
    <xsl:param name="limit" select="200"/>
    <!-- dummy-node matches nothing, so the default is an empty node-set -->
    <xsl:param name="remaining-nodes" select="dummy-node"/>
    <!-- 1. calculate the total length of the nodes -->
    <xsl:variable name="lengths">
      <xsl:for-each select="$nodes">
        <length>
          <xsl:value-of select="string-length()" />
        </length>
      </xsl:for-each>
    </xsl:variable>
    <xsl:variable name="total-length" select="sum(exsl:node-set($lengths)/length)" />
    <!-- 2. process the chapter: -->
    <xsl:choose>
      <!-- if the chapter is too long and can be shortened ... -->
      <xsl:when test="$total-length &gt; $limit and count($nodes) &gt; 1">
        <!-- ... try again with one node less. -->
        <xsl:call-template name="split-chapter">
          <xsl:with-param name="nodes" select="$nodes[not(position()=last())]"/>
          <xsl:with-param name="remaining-nodes" select="$remaining-nodes | $nodes[last()]"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- otherwise create a chapter with the current nodes ... -->
        <chapter id="{@id}" length="{$total-length}">
          <!-- ... list the paras participating in this chapter ... -->
          <xsl:for-each select="$nodes/parent::para">
            <para>
              <!-- ... and process the nodes still left in each para. -->
              <xsl:apply-templates select="set:intersection(sentence, $nodes)"/>
            </para>
          </xsl:for-each>
        </chapter>
        <!-- process the remaining nodes. -->
        <xsl:if test="$remaining-nodes">
          <xsl:call-template name="split-chapter">
            <xsl:with-param name="nodes" select="$remaining-nodes"/>
          </xsl:call-template>
        </xsl:if>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
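The recursion in the split-chapter template can be easier to follow outside XSLT. The Python sketch below mirrors it on a plain list of sentence lengths, assuming only the character counts matter. `split_nodes` is a hypothetical name, and the numbers are the illustrative length values of chapter b.

```python
def split_nodes(nodes, limit=200, remaining=None):
    """Mirror of the split-chapter template on a list of sentence lengths.

    `nodes` is the candidate chunk; `remaining` holds the sentences that were
    trimmed off its end and still have to be placed in later chunks.
    """
    remaining = remaining or []
    if sum(nodes) > limit and len(nodes) > 1:
        # Too long and can be shortened: retry with one node less,
        # putting the dropped node at the front of the remainder.
        return split_nodes(nodes[:-1], limit, [nodes[-1]] + remaining)
    # Otherwise emit the chunk, then start over with whatever is left.
    chunks = [nodes]
    if remaining:
        chunks += split_nodes(remaining, limit)
    return chunks


# Illustrative lengths of the seven sentences in chapter "b".
chapter_b = [35, 40, 52, 35, 54, 57, 35]
for chunk in split_nodes(chapter_b):
    print(sum(chunk), chunk)
```

Because each retry recomputes the total from scratch, the work grows roughly quadratically with the number of sentences in a chapter. That is probably acceptable for chapters of modest size, but it is worth keeping in mind for very long ones.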
Result
<?xml version="1.0" encoding="utf-8"?>
<book>
  <chapter id="a" length="167">
    <para>
      <sentence id="1" length="23">Mary had a little lamb,</sentence>
      <sentence id="2" length="29">His fleece was white as snow,</sentence>
      <sentence id="3" length="30">And everywhere that Mary went,</sentence>
    </para>
    <para>
      <sentence id="4" length="24">The lamb was sure to go.</sentence>
      <sentence id="5" length="34">He followed her to school one day,</sentence>
    </para>
    <para>
      <sentence id="6" length="27">Which was against the rule,</sentence>
    </para>
  </chapter>
  <chapter id="a" length="120">
    <para>
      <sentence id="7" length="35">It made the children laugh and play</sentence>
      <sentence id="8" length="24">To see a lamb at school.</sentence>
    </para>
    <para>
      <sentence id="9" length="34">And so the teacher turned it out, </sentence>
      <sentence id="10" length="27">But still it lingered near.</sentence>
    </para>
  </chapter>
  <chapter id="b" length="162">
    <para>
      <sentence id="11" length="35">Summertime, and the livin' is easy.</sentence>
      <sentence id="12" length="40">Fish are jumpin' and the cotton is high.</sentence>
      <sentence id="13" length="52">Oh, your daddy's rich and your mamma's good lookin'.</sentence>
      <sentence id="14" length="35">So hush little baby, don't you cry.</sentence>
    </para>
  </chapter>
  <chapter id="b" length="146">
    <para>
      <sentence id="15" length="54">One of these mornings you're going to rise up singing.</sentence>
    </para>
    <para>
      <sentence id="16" length="57">Then you'll spread your wings and you'll take to the sky.</sentence>
      <sentence id="17" length="35">So hush little baby, don't you cry.</sentence>
    </para>
  </chapter>
</book>