I do not know if this is of interest.
I tried the free software “pandoc” (https://pandoc.org/) to convert the XHTML file exported from Pressbooks. I did this because the exported ODT file does not use paragraph styles (Heading 1, Heading 2, …), which makes it difficult to edit.
- I first use an XSLT stylesheet to get the heading levels in the exported XHTML ‘right’ and to remove the table of contents and the chapter numbers (both are later generated dynamically in Word).
- Then I use pandoc to convert the transformed XHTML to DOCX.
Pandoc can pull in the referenced images. If I like, I can also supply a Word document that is then used for styling. Pandoc can also create ODT files.
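Both options can be sketched in one small helper. This is only an illustration: the function name `convert_to` is made up, `--reference-doc` assumes pandoc 2.0 or later, and the files `reference.docx` / `reference.odt` are assumed to be in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: convert the transformed XHTML to DOCX or ODT,
# naming the reference document for styling explicitly.
# Assumes pandoc >= 2.0 and a reference.docx / reference.odt in the
# current directory.
convert_to() {
    format="$1"    # "docx" or "odt"
    pandoc output.html -f html -t "$format" \
        --reference-doc="reference.$format" \
        -o "output.$format"
}
```

Calling `convert_to odt` would write output.odt, and `convert_to docx` would write output.docx.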
The result is astonishingly good. Its quality allows further editing and later re-import into Pressbooks (“roundtripping”). This is useful for a major overhaul of content that the author, for some reason, wants to do outside Pressbooks.
In case somebody wants to try, here are the XSLT stylesheet and the script that I ran on the command line on a Mac.
Script:
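#!/bin/sh
# Step 1: fix the heading levels and strip the table of contents and chapter numbers (Saxon-HE, XSLT 2.0).
/usr/bin/java -jar /opt/saxon-he/saxon9he.jar -s:input.html -xsl:pandoc.xsl -o:output.html
# Step 2: convert the transformed XHTML to DOCX.
pandoc output.html --data-dir . -f html -t docx -o output.docx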
The option
--data-dir .
means that the optional reference Word document, named “reference.docx”, is looked up in the current directory.
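If no reference document exists yet, pandoc can write out its built-in default, which can then be restyled in Word. A minimal sketch, assuming a pandoc version that supports `--print-default-data-file`; the function name `make_reference_doc` is only illustrative.

```shell
#!/bin/sh
# Illustrative helper: write pandoc's built-in default reference.docx into
# the current directory, unless one already exists (so that an already
# restyled document is not overwritten).
make_reference_doc() {
    if [ -e reference.docx ]; then
        echo "reference.docx already exists, leaving it untouched" >&2
        return 0
    fi
    pandoc -o reference.docx --print-default-data-file reference.docx
}
```

After running `make_reference_doc`, the styles in reference.docx can be edited in Word and are then picked up by the conversion.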
pandoc.xsl:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml" xmlns:h="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="h">
<xsl:output method="xml" encoding="UTF-8" indent="no"
doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
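<!-- Drop the table of contents; it is regenerated in Word -->
<xsl:template match="h:div[@id = 'toc']"> </xsl:template>
<!-- Promote the chapter title from h2 to h1 -->
<xsl:template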
match="h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'chapter', ' '))]/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'chapter-title-wrap', ' '))]/h:h2">
<h1>
<xsl:apply-templates select="@* | node()"/>
</h1>
</xsl:template>
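<!-- Drop the h3 elements in the title blocks of front matter, back matter and chapters (the chapter numbers) -->
<xsl:template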
match="h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'front-matter', ' '))]/h:div/h:h3"> </xsl:template>
<xsl:template
match="h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'back-matter', ' '))]/h:div/h:h3"> </xsl:template>
<xsl:template
match="h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'chapter', ' '))]/h:div/h:h3"> </xsl:template>
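<!-- Shift the headings inside the content (class "ugc") down one level: h1 to h2, h2 to h3, and so on -->
<xsl:template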
match="h:div/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'ugc', ' '))]/h:h1">
<h2>
<xsl:apply-templates select="@* | node()"/>
</h2>
</xsl:template>
<xsl:template
match="h:div/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'ugc', ' '))]/h:h2">
<h3>
<xsl:apply-templates select="@* | node()"/>
</h3>
</xsl:template>
<xsl:template
match="h:div/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'ugc', ' '))]/h:h3">
<h4>
<xsl:apply-templates select="@* | node()"/>
</h4>
</xsl:template>
<xsl:template
match="h:div/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'ugc', ' '))]/h:h4">
<h5>
<xsl:apply-templates select="@* | node()"/>
</h5>
</xsl:template>
<xsl:template
match="h:div/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'ugc', ' '))]/h:h5">
<h6>
<xsl:apply-templates select="@* | node()"/>
</h6>
</xsl:template>
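<!-- There is no h7, so h6 becomes a div with the custom style "Titel o. Nr." (German for "title without number"); for DOCX output pandoc applies the Word style of that name -->
<xsl:template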
match="h:div/h:div[contains(concat(' ', normalize-space(@class), ' '), concat(' ', 'ugc', ' '))]/h:h6">
<div custom-style="Titel o. Nr.">
<xsl:apply-templates select="@* | node()"/>
</div>
</xsl:template>
<!-- Copy as is everything else -->
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
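A quick way to check that the stylesheet produced the intended heading levels is to count the heading tags in the transformed file before running pandoc. This is a rough sketch: the function name `count_headings` is made up, and it relies on `grep -o`, which GNU and BSD/macOS grep support but POSIX does not require.

```shell
#!/bin/sh
# Illustrative check: count opening heading tags per level in an (X)HTML file.
# It matches "<h1" .. "<h6" textually, so it is a sanity check, not a parser.
count_headings() {
    grep -o '<h[1-6]' "$1" | sort | uniq -c
}
```

Running `count_headings output.html` prints one line per heading level together with the number of headings found at that level.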