Bug 713278

Summary: Messages built from parts can not be translated
Product: [Fedora] Fedora Reporter: Göran Uddeborg <goeran>
Component: anaconda Assignee: Ales Kozumplik <akozumpl>
Status: CLOSED NEXTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: akozumpl, anaconda-maint-list, jonathan, jzeleny, piotrdrag, vanmeeuwen+fedora
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: anaconda-16.15-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-04-03 17:13:25 UTC Type: ---

Description Göran Uddeborg 2011-06-14 20:57:05 UTC
While translating the latest version of anaconda, the master branch, on Transifex, I encountered these messages:

  RAID sets containing the %s must have one of the following raid levels: %s.

and

  RAID sets containing the %s must have one of the following metadata versions: %s.

To translate I needed to have an idea what the first %s would become, so I took a look in the code.  If I'm not mistaken, it can become a number of different things like "partition", "disk", etc.

While this might work in English, composing a sentence from parts in that way does not work in general.  Some languages have different forms of "the" depending on what word it applies to.  In Swedish, the definite form of a noun isn't constructed with a separate definite article at all, but with an ending of the word itself.  I'm sure other languages would have other issues.

So the bottom line is: composing sentences from parts in this way doesn't work if you want the sentence to be translatable.
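A minimal sketch of the pattern described above, written in Python because anaconda is a Python program. The function name, the `_` stand-in for gettext, and the RAID level value are assumptions made for illustration; this is not anaconda's actual code.

```python
# Hypothetical illustration; none of these names are taken from anaconda.

def _(message):
    """Stand-in for gettext's _(): returns the message untranslated."""
    return message


def composed_raid_message(device_type, levels):
    # The translator sees only this template in the message catalog and
    # cannot tell which noun will replace the first %s, so the article
    # "the" cannot be made to agree with it (or be folded into the noun).
    template = _("RAID sets containing the %s must have one of the "
                 "following raid levels: %s.")
    return template % (_(device_type), ", ".join(levels))


print(composed_raid_message("partition", ["RAID1"]))
```

The template and the inserted noun are translated separately, which is why a grammatical form that depends on the noun cannot be expressed in the translation.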

Comment 1 Ales Kozumplik 2011-08-08 14:09:23 UTC
(In reply to comment #0)
> So the bottom line is: composing sentences from parts in this way doesn't work
> if you want the sentence to be translatable.

Would it help if the sentences didn't try so hard to look natural and instead used a colon before the type description?

For instance instead of:

RAID sets containing the %s must have one of the following raid levels: %s.

We would have:
RAID sets that contain: "%s" must have one of the following raid levels: %s.

I know this would look more 'generated' on the screen but my hope is that it would allow more translations to look grammatically correct.

Comment 2 Göran Uddeborg 2011-08-09 09:08:30 UTC
I don't really know how well it would work in general.

My own language is Swedish, and that is a language relatively close to English.  For Swedish it would have worked if you just moved the word "the" inside the strings of the first %s.  But I have no idea how well that would work for Arabic, Urdu, or Chinese.

I suspect, though, that at least some other languages would have had further problems.  From what I've seen in discussions on translation mailing lists, I believe the only way to get natural messages in translations is to make complete original messages for each case.  It's tedious, both for the developer, and for many translators where it wouldn't have been necessary.  But it is the only thing that seems to work for everybody.

Some options:

1. Make a complete table of all combinations of the objects and the messages, and do a lookup at run time.  This will be a lot of messages, but it should work for all languages.

2. Move "the" into the strings being inserted.  If you have a message with "a %s" in some cases and "the %s" in others, both "a" and "the" need to be moved.  So two separate strings are needed for each string that can be substituted for the %s.  This will fix it for Swedish, and probably several other relatively close languages.  Then you could wait for reports about other languages where it still won't work, and figure out how to solve their cases.  (If I knew exactly which strings were to be inserted, I could do this already.  But it isn't obvious when you sit with the message catalog.  I would have to do a lot of code reading for each message.)

3. Do a more form-style message.  Something along the lines you suggest.  For Swedish, it would look roughly as good or as bad as it does in English.  I GUESS it would be the same for many other languages, but I don't really know what could happen.
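Option 1 could look roughly like the following Python sketch. The table name, the `N_` marker helper, the function name, and the set of device types are all hypothetical; only the "partition" and "disk" examples come from the description above.

```python
# Hypothetical sketch of option 1: one complete message per combination.

def _(message):
    """Stand-in for gettext's _(): returns the message untranslated."""
    return message


def N_(message):
    """Marks a string for extraction without translating it yet."""
    return message


# Every entry is a full sentence, so each one can be translated on its own.
RAID_LEVEL_MESSAGES = {
    "partition": N_("RAID sets containing the partition must have one of "
                    "the following raid levels: %s."),
    "disk": N_("RAID sets containing the disk must have one of "
               "the following raid levels: %s."),
}


def raid_level_message(device_type, levels):
    # Look up the complete sentence at run time, then translate it.
    template = RAID_LEVEL_MESSAGES[device_type]
    return _(template) % (", ".join(levels),)


print(raid_level_message("disk", ["RAID0", "RAID1"]))
```

The cost is the one noted in the option itself: the number of catalog entries grows with every combination of object and message.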

Comment 3 Ales Kozumplik 2011-08-09 14:12:26 UTC
I see this is very tricky.

Having all combinations ready-made is not attainable for this case of messages that appear only rarely.

Moving the articles inside the inserted strings creates another problem: sometimes the expressions appear at the beginning of the sentence and sometimes in the middle, so we would have a first-letter casing problem. And I am sure there are languages that change word order quite a lot.

In the proposed patch I used the form-style where necessary and removed the articles completely in the rest of the places:

https://www.redhat.com/archives/anaconda-devel-list/2011-August/msg00112.html

I know it's far from good, but hopefully it will allow you to go on with the translation. We need to learn to handle strings so they are more suitable for translation.

Comment 4 Göran Uddeborg 2011-08-09 16:39:43 UTC
The problem isn't trivial.  You've probably found a good compromise.  One little idea:  maybe it would become more "form like" with an additional newline?  Instead of

  RAID sets that contain: %s must have one of the following raid levels: %s.

what about

  RAID sets that contain: %s
  must have one of the following raid levels: %s.

(That is, if newlines are preserved when those messages are presented, of course.)
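As a rough Python sketch of the two-line form-style message suggested here: the function name and the `_` stand-in for gettext are assumptions, and whether the newline survives depends on how the installer displays the string, which this sketch does not check.

```python
# Hypothetical sketch of the form-style message with an embedded newline.

def _(message):
    """Stand-in for gettext's _(): returns the message untranslated."""
    return message


def form_style_raid_message(device_description, levels):
    # The inserted description follows a colon and ends its own line, so it
    # no longer has to agree grammatically with the surrounding sentence.
    template = _("RAID sets that contain: %s\n"
                 "must have one of the following raid levels: %s.")
    return template % (device_description, ", ".join(levels))


print(form_style_raid_message("partition", ["RAID1"]))
```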

Comment 5 Ales Kozumplik 2011-08-15 06:49:28 UTC
Fixed by cb252fc2d488d159dc88d02a86df9668d4c9b093.

Comment 6 Göran Uddeborg 2011-08-18 21:44:16 UTC
Swedish translation updated on Transifex. :-)

Comment 7 Piotr Drąg 2012-04-03 17:13:25 UTC
Fixed as per comments #5 and #6.