Tuesday, 10 March 2015

SDL Studio CU8: Conflict between “Update Main Translation Memory” batch task and “Add as New Translation” function

With Cumulative Update 8 (CU8) for SDL Studio 2014, the batch task “Update Main Translation Memory” offers new options and, above all, a new default setting that may change the way translators work.


New options for the “Update Main Translation Memory” batch task and new default setting

In the batch task prior to CU8, it was possible to “Always add new translation units” so that existing translation units with the same source segment remained in the memory and were not overwritten when the target segment differed. In this case, Studio created a duplicate translation unit.



Now, when you want to update a segment in the TM where the source is identical but the target is different in the sdlxliff file, it is possible to:
  • “Add new translation units”: a duplicate translation unit will be created. This is equivalent to the old option “Always add new translation units” in the former version.
  • “Overwrite existing translation units”: all existing translation units with the same source segment will be systematically deleted and replaced by the one in the sdlxliff. This option is the new default setting for the batch task.
  • “Leave translation units unchanged”: the batch task will only add a translation unit if its source segment does not yet exist in the TM.
  • “Keep most recent translation units”: the last modification dates of all existing translation units with the same source segment will be compared with each other, and the newest unit will remain in the TM. All others will be deleted.
The old behavior, in which only one translation unit is overwritten when duplicates are present, is no longer available. Either you overwrite all TUs that have the same source and keep only one, or you don’t update any existing TUs at all.
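
The four options can be pictured with a small toy model. This is only a sketch of the behavior described above, not Studio's API or its real matching logic: the `TU` class, the `update_tm` function and the mode names are hypothetical, and it assumes that the incoming unit takes part in the date comparison for “Keep most recent translation units”.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TU:
    """A translation unit in this toy model (hypothetical, not Studio's data model)."""
    source: str
    target: str
    modified: date


def update_tm(tm, incoming, mode):
    """Return a new TM list after applying `incoming` with the given mode."""
    same_source = [tu for tu in tm if tu.source == incoming.source]
    others = [tu for tu in tm if tu.source != incoming.source]

    if not same_source:
        # A source segment that is new to the TM is added in every mode.
        return tm + [incoming]

    if mode == "add":
        # Keep the existing units and store the incoming one as a duplicate,
        # unless the same source/target pair is already present.
        if any(tu.target == incoming.target for tu in same_source):
            return list(tm)
        return tm + [incoming]
    if mode == "overwrite":
        # New default in CU8: every unit with this source is replaced.
        return others + [incoming]
    if mode == "leave":
        return list(tm)
    if mode == "keep_most_recent":
        # Assumption: the incoming unit is compared along with the existing ones.
        newest = max(same_source + [incoming], key=lambda tu: tu.modified)
        return others + [newest]
    raise ValueError(f"unknown mode: {mode}")


tm = [
    TU("Save", "Speichern", date(2014, 1, 10)),
    TU("Save", "Sichern", date(2014, 6, 5)),   # stored on purpose as a second version
    TU("Close", "Schliessen", date(2014, 3, 1)),
]
incoming = TU("Save", "Abspeichern", date(2015, 3, 10))

for mode in ("add", "overwrite", "leave", "keep_most_recent"):
    result = update_tm(tm, incoming, mode)
    print(mode, [tu.target for tu in result if tu.source == "Save"])
```

In this model, “overwrite” leaves a single target for “Save”, which is exactly the loss of intentionally stored duplicates discussed in the next section.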


Conflict with the “Add as New Translation” function 



The "Add as new translation" function is important when a segment should be translated differently depending on the context. Translators can store two different versions with the shortcut Ctrl+Shift+U. When such a segment comes up again, it is treated as a 99% fuzzy match because of the 1% default penalty for "Multiple Translations". The pre-translation process won't insert any translation in the sdlxliff file, so the translator can consciously choose between the two possibilities in the editor. The translator gets paid for this work just as for a "normal" 99% fuzzy match.

In a scenario where translators use this “Add as New Translation” function, the batch task “Update Main Translation Memory” with the default setting will delete all duplicates that were stored in the TM for the given segment. As the batch task is available to both translators and project managers, being unaware of this new setting can have serious consequences for the translation resources: segments like those described above will be recognized as 100% matches, which can mean not being paid for the segment. In addition, the segment may have been pre-translated incorrectly.
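
The effect on match values can be illustrated with a simplified sketch. It only models the 1% "Multiple Translations" penalty mentioned above. The functions `lookup` and `pretranslate`, the `min_score` parameter and the sample segments are hypothetical and do not reflect how Studio computes matches internally.

```python
MULTIPLE_TRANSLATIONS_PENALTY = 1  # percent; the default value named in the article


def lookup(tm, source):
    """Return (target, score) pairs for exact source matches in a toy TM."""
    targets = [t for s, t in tm if s == source]
    if not targets:
        return []
    # Several different targets for the same source trigger the penalty.
    penalty = MULTIPLE_TRANSLATIONS_PENALTY if len(set(targets)) > 1 else 0
    return [(t, 100 - penalty) for t in targets]


def pretranslate(tm, source, min_score=100):
    """Insert a translation only if a match reaches min_score (assumed threshold)."""
    candidates = [t for t, score in lookup(tm, source) if score >= min_score]
    return candidates[0] if candidates else None


# Before the batch task: two versions stored on purpose with "Add as New Translation".
tm_before = [("Save", "Speichern"), ("Save", "Sichern")]
# After the batch task with "Overwrite existing translation units": one unit is left.
tm_after = [("Save", "Sichern")]

print(lookup(tm_before, "Save"), pretranslate(tm_before, "Save"))
print(lookup(tm_after, "Save"), pretranslate(tm_after, "Save"))
```

With both versions in the TM, each match scores 99 and nothing is pre-translated. Once one of them has been deleted, the remaining unit scores 100 and is inserted without the translator being asked.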


Here is an example

File to update in the TM



TM before “Update Main Translation Memory”. It contains two pairs of duplicates: TUs #1 and #3 were stored intentionally by pressing Ctrl+Shift+U. TU #7 should have overwritten TU #5, but it was unexpectedly added to the TM.



With the old batch task, the TM would have remained as it is. With the new batch task and its default setting, the TM after “Update Main Translation Memory” looks like this:



Both TU #5 and TU #1 have been deleted. The deletion of TU #1 is counterproductive.


If you prefer not to risk deleting important data, take the time to set the default value to “Add new translation units” in your project templates.

The default setting will be discussed again at SDL and could be changed in Cumulative Update 10.

3 comments:

  1. Bonjour Sébastien,

    We stumbled on your very useful article when we couldn't understand how our multiple translations suddenly got deleted after CU8.

    We find it very concerning that SDL would change such an important default behavior during a CU and is even considering changing it again for the next CU.

    The thing is that none of the proposed options seems right to me. Overwrite will kill the multiple TUs that you have so carefully added over the years, and Add TU will create multiples when you only want to fix a bad translation or a typo.


    Shouldn't we have this option instead?

    If no multiples exist for the source segment, then overwrite
    If multiples exist for the source segment, then add a new TU if the target differs from existing targets.

    Thank you.

    Michel Farhi
    Senior Localization Manager
    National Instruments

  2. By the way, our German translator tried going back to the "Add new TU" option and she reports that after the third project, the multiples are still there, but are all 100% instead of 99%. Have you seen this behavior? We'll file a bug with support shortly about this. We work with Fas on reviewing these issues regularly. Thank you.

  3. Bonjour Michel,
    Reintroducing the default option as it was before CU8 is the main idea. As far as I know, it is close to the one you describe but relies more on the many connections between the sdlxliff files and the TM: each time you translate a segment, a reference is inserted in your sdlxliff to point to the translation unit you used. Then, basically, if you make an offline change in the segment and then update your TM online with the batch task, Studio changes the relevant translation unit by means of the reference.
    And I write "basically" because the algorithm is much more complicated than this short description and considers, for example, other parameters such as the context of the segment to decide whether a duplicate should be created instead. I am only aware of part of the concept :o)
    This article from Paul Filkin's blog helped me to understand other aspects of the TM logic: http://multifarious.filkin.com/2013/02/13/100percent/, maybe it is also of interest to you.
    Concerning your second comment: I can’t remember experiencing many 100% at once, just a CM and many 100% when the context played a role.
    Thank you for your comments
    Kind regards
    Sébastien
