Six reasons for preferring raw DITA to XLIFF for localization

DITA localization can be deployed in several ways, depending on the resources: with a CCMS or a content repository, with a translation management system managed in-house or by the Language Service Provider (LSP), with an internal or an external team of translators, and with a server, desktop or cloud solution.

All DITA localization processes look like this:

XLIFF, the interchange standard for the localization industry

OASIS, the organization that develops DITA, released another open standard in 2002: XLIFF, the XML Localization Interchange File Format. XLIFF was originally designed as an interoperability standard for exchanging content between the TMS and translator workbenches (2 in the diagram above), and it is now the de facto standard for that exchange. Its current specification is v2.1, released in February 2018.

Is XLIFF the right standard for CCMS and TMS integration?

Because XLIFF was designed for localization interchange, it seems well suited to exchanging files between the CCMS and the TMS. Most CCMS suppliers provide a localization setting that exports the content (1 in the diagram above) as XLIFF instead of DITA. This option does not exist when the content is managed in a content repository such as GitHub or SVN, although some tools exist to convert DITA to XLIFF and back.
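To make the conversion concrete, here is a minimal sketch of what a DITA-to-XLIFF export does, using only Python's standard library. It is an illustration, not any vendor's converter: the sample topic, the function name dita_to_xliff and the file name install.dita are assumptions, and it deliberately ignores inline markup, tail text and attributes that a real tool would have to handle.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # serialize XLIFF with a default namespace

# Hypothetical sample topic, for illustration only.
DITA_TOPIC = """<topic id="install">
  <title>Installing the pump</title>
  <body>
    <p>Close the valve.</p>
    <ul><li>Check the seal.</li></ul>
  </body>
</topic>"""


def dita_to_xliff(dita_xml, original="install.dita", source_lang="en"):
    """Wrap the leading text of each DITA element in an XLIFF 1.2 trans-unit."""
    topic = ET.fromstring(dita_xml)
    xliff = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(
        xliff,
        f"{{{XLIFF_NS}}}file",
        {"original": original, "source-language": source_lang, "datatype": "xml"},
    )
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    unit_id = 0
    for element in topic.iter():
        text = (element.text or "").strip()
        if not text:
            continue  # structural element with no direct text
        unit_id += 1
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": str(unit_id)})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
    return xliff


if __name__ == "__main__":
    print(ET.tostring(dita_to_xliff(DITA_TOPIC), encoding="unicode"))
```

Note that in this naive form the title, the paragraph and the list item all become identical-looking trans-units: nothing in the output says which one came from a list. A real exporter can carry such hints in extra metadata, but it has to be built to do so.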

XLIFF offers no real advantage compared to DITA exchange

Every TMS accepts XLIFF v1, but not all of them accept XLIFF v2.1 yet. On the other hand, they all have XML filters that can be tuned for DITA.

The XLIFF packages sent to the translators go through three alterations, whatever format the TMS received. First, the content is segmented by sentence rather than by XML element. Second, the appropriate localization filters are applied to leverage translation memories. Lastly, the resulting segments are grouped for sending to each translator in the project team.
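The first alteration, segmenting by sentence rather than by XML element, can be sketched as follows. This is a simplified illustration: the sample paragraph and both function names are assumptions, and the regular expression is a naive stand-in for the segmentation rules a real TMS applies.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample content, for illustration only.
PARAGRAPH = "<p>Close the valve. Wait ten seconds. Then open the cover.</p>"


def element_segments(xml_text):
    """One segment per <p> element, as in an element-based export."""
    root = ET.fromstring(xml_text)
    return ["".join(p.itertext()).strip() for p in root.iter("p")]


def sentence_segments(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


if __name__ == "__main__":
    for segment in element_segments(PARAGRAPH):
        print("element :", segment)
        for sentence in sentence_segments(segment):
            print("sentence:", sentence)
```

The one paragraph yields a single element-level segment but three sentence-level segments, and it is the sentence-level ones that get matched against translation memories.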

XLIFF prevents using advanced processes

Thanks to the maturity of DITA, some TMS suppliers and language service providers have developed the expertise and technology to support DITA localization better. The XLIFF format is usually not their preferred option; here are some of its shortfalls:

  • It can hide useful context from the translators, such as whether the content to be translated sits within a table, an image, or a list.
  • It can prevent the localization project manager from using XML editors, such as Oxygen or XMetaL, when further investigation is needed.
  • It can prevent the translation processes from republishing the content using DITA-OT and specialized plug-ins.
  • It can prevent a strict validation of the translated content against the DTDs or schemas.
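As an illustration of the last point, here is the kind of check that becomes possible when the translated file is still DITA. This sketch is not DTD or schema validation, which requires a validating parser and the DITA DTDs. It is a lighter structural comparison, assuming the translation should keep exactly the same element tree as the source. The sample topics and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source topic and two candidate translations.
SOURCE = '<topic id="t"><title>Start</title><body><p>Press <uicontrol>OK</uicontrol>.</p></body></topic>'
GOOD = '<topic id="t"><title>Démarrer</title><body><p>Appuyez sur <uicontrol>OK</uicontrol>.</p></body></topic>'
BROKEN = '<topic id="t"><title>Démarrer</title><body><p>Appuyez sur OK.</p></body></topic>'


def structure(xml_text):
    """Return the element tree as a list of (depth, tag) pairs, ignoring text."""
    pairs = []

    def walk(element, depth):
        pairs.append((depth, element.tag))
        for child in element:
            walk(child, depth + 1)

    # ET.fromstring raises ET.ParseError if the file is not well-formed XML.
    walk(ET.fromstring(xml_text), 0)
    return pairs


def same_structure(source_xml, translated_xml):
    """True if both documents have identical element nesting and order."""
    return structure(source_xml) == structure(translated_xml)


if __name__ == "__main__":
    print("good translation keeps structure:  ", same_structure(SOURCE, GOOD))
    print("broken translation keeps structure:", same_structure(SOURCE, BROKEN))
```

The comparison is deliberately strict: a translation that legitimately reorders inline elements would be flagged too, so in practice a check like this is a first filter before full validation rather than a replacement for it.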

We recommend transferring raw DITA content from your CCMS to your TMS

In our ten-plus years specializing in DITA localization, we have witnessed many process architects who had initially chosen XLIFF revert to raw DITA content to improve their processes, while we have not seen any taking the opposite route.

Converting your DITA content to XLIFF before sending it to localization is “safe and dirty”. Safe, because any translator or translation company new to DITA will be able to translate the text. Dirty, because it rules out the quality and productivity features that lower localization costs.

Related reading: Are you CMS or TMS-centric? How your architecture impacts your localization quality and processes.