Language tags missing from PDF exports

We are using the Prince PDF export routine. While producing a series of books in a mix of languages, our digital publishing specialist discovered that language tags from the web book do not appear in the PDF exports. These are a required accessibility feature, and because there are thousands of them in these texts, we cannot remediate this by hand.

I do not have much experience configuring Prince, but I was wondering whether others have run into this issue or could point me toward a potential fix. I also want to raise this as an accessibility concern if there is currently no pathway in the system to push these tags into the PDF.

I’ll open an issue in the GitHub repository, as I believe this is a bug. Here is a bit of my testing:

Oddly, footnotes containing <span lang="fr"> come through in the PDF correctly (i.e., the span shows up as a tag, and that tag has the language as a property).

Pressbooks uses the XHTML as the source document for Prince. In that XHTML document, footnotes have <span lang="fr">, while other places in the document have <span xml:lang="fr"> or <p xml:lang="fr">. For some reason, Prince is not recognizing the xml:lang attribute but is recognizing the lang attribute.
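For anyone who wants to check how widespread this is in their own book, here is a minimal Python sketch that lists elements carrying xml:lang but no plain lang attribute. It is not part of Pressbooks or Prince; the function name find_xml_lang_only and the sample markup are made up for illustration, and it assumes the XHTML is well-formed XML without undeclared named entities (the standard-library parser will reject those).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def find_xml_lang_only(xhtml_text):
    """Return (tag, language) pairs for elements with xml:lang but no lang."""
    root = ET.fromstring(xhtml_text)
    hits = []
    for el in root.iter():
        if XML_LANG in el.attrib and "lang" not in el.attrib:
            # Drop the "{namespace}" prefix so the tag name is readable.
            tag = el.tag.rsplit("}", 1)[-1]
            hits.append((tag, el.attrib[XML_LANG]))
    return hits


# Hypothetical sample mirroring the pattern described above.
sample = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p xml:lang="fr">Bonjour</p>
<p>See <span lang="fr">note</span> and <span xml:lang="de">Hallo</span>.</p>
</body>
</html>"""

print(find_xml_lang_only(sample))  # [('p', 'fr'), ('span', 'de')]
```

The elements it reports are the ones that would lose their language in the PDF if Prince only honors lang.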

After consulting the Prince documentation, I found a few leads. To cover my bases, I added @namespace xml "http://www.w3.org/XML/1998/namespace"; to the PDF CSS, with no effect. I then told Prince that the source of the digital PDF is XML by adding $prince->setInputType('xml'); to the Prince wrapper for digital PDFs. This does cause the PDF to generate with the appropriate language property in all of the places where I had <span lang="fr">, as well as in places like <p xml:lang="fr"> that appeared as xml:lang in the XHTML.
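To reproduce the same test outside Pressbooks, something like the following Python sketch should work. It assumes the prince executable is on the PATH and relies on Prince's --input=xml command-line option, which I understand to be the counterpart of setInputType('xml') in the PHP wrapper. The function names and file paths are placeholders, not anything from the Pressbooks codebase.

```python
import subprocess


def build_prince_command(source_path, output_path, prince_bin="prince"):
    """Build a Prince command line that forces the input to be parsed as XML."""
    # --input=xml asks Prince to skip its HTML parser, so xml:lang is
    # handled as a namespaced XML attribute rather than an unknown one.
    return [prince_bin, "--input=xml", source_path, "-o", output_path]


def export_pdf(source_path, output_path):
    """Run Prince; raises CalledProcessError if Prince exits with an error."""
    subprocess.run(build_prince_command(source_path, output_path), check=True)
```

Calling export_pdf("book.xhtml", "book.pdf") on a saved copy of the XHTML export and then inspecting the tags in the resulting PDF would show whether the language properties survive without any Pressbooks code involved.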

It would be nice if someone with a bit more experience with this toolchain could review this issue and provide feedback.