Associating a group of symbols (or any symbols in a particular group) with one pronunciation under symbols.dic #2676

nvaccessAuto · 2012-09-19T06:28:55Z

Reported by nvdakor on 2012-09-19 06:28
Hi,
In symbols.dic for some languages, apart from using regex, is it possible (or would it be possible) to perform the following:

Create a group of symbols.
The group of these individual symbols would be given a single pronunciation.
Any individual symbol within this pronunciation group would use this one pronunciation when spoken.
This could be useful for tonal languages such as Korean and Vietnamese, which associate one pronunciation with multiple individual symbols. This could also help with faster symbol processing, as a translator doesn't have to define the same pronunciation (one per line) for any number of individual symbols.
For example, in the current symbols.dic syntax:
sym(tab)pronunciation(tab)punctuationLevel
Suppose we wish to assign the letters "a" and "b" to be pronounced as "A":
a(tab)A(tab)none
b(tab)A(tab)none
Following the proposal above, we could say:
{a,b}(tab)A(tab)none
To provide punctuation levels for each individual symbol in the braces, I'd like to propose:
{(A(tab)puncLevel),(b}(tab)A(tab)none
With the priority given to puncLevel for symbols surrounded by parentheses.
Thanks.

nvaccessAuto · 2012-09-19T08:29:19Z

Comment 1 by jteh (in reply to comment description) on 2012-09-19 08:29
Thanks for the suggestion. My thoughts:

Replying to nvdakor:

In symbols.dic for some languages, apart from using regex,

Why is regex a problem? It's not particularly difficult to do this with a regex. You just enclose the symbols in square brackets; e.g. [abc]. The only disadvantage is that you can only have 90 or so complex symbols.
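A minimal sketch of the character-class approach, using plain Python `re` rather than NVDA's actual symbol-processing code. The names `GROUP_PATTERN`, `REPLACEMENT` and `speak_group` are hypothetical and exist only for this illustration:

```python
import re

# Hypothetical illustration: one character class matches any symbol in the group.
GROUP_PATTERN = re.compile(r"[ab]")
REPLACEMENT = "A"


def speak_group(text: str) -> str:
    """Replace every symbol in the group with the shared pronunciation."""
    return GROUP_PATTERN.sub(REPLACEMENT, text)


print(speak_group("a b c"))  # prints "A A c"
```

Adding another symbol to the group only means adding it inside the square brackets.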

{a,b}(tab)A(tab)none

One problem with this is that we'd have to escape {. For example, to match just {, the user would have to write \{ to distinguish it from a grouping. This would break user symbol files. Also, users would not be able to configure the pronunciation of the individual symbols if they wanted to, although maybe this is intended.
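To make the escaping concern concrete, here is a rough sketch of how the first field of a line might be parsed if the proposed brace syntax existed. Everything here is an assumption: `parse_symbol_field` is a hypothetical helper, and neither the `{a,b}` grouping nor the `\{` escape is part of the current symbols.dic format.

```python
from __future__ import annotations


def parse_symbol_field(field: str) -> list[str]:
    """Return the symbols named by the first field of a hypothetical grouped line."""
    if field.startswith("\\{"):
        # Escaped brace: drop the backslash and treat the rest literally.
        return [field[1:]]
    if field.startswith("{"):
        # An unescaped leading brace is now always read as the start of a group.
        if not field.endswith("}") or len(field) < 3:
            raise ValueError(f"unterminated or empty group: {field!r}")
        return field[1:-1].split(",")
    return [field]


print(parse_symbol_field("{a,b}"))  # ['a', 'b']
print(parse_symbol_field("\\{"))    # ['{']
```

Under these assumptions, an existing user file that defines a bare `{` as a symbol would raise `ValueError` until the line is rewritten with the escape, which is the breakage described above.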

This could also help with faster symbol processing, as a translator doesn't have to define the same pronunciation (one per line) for any number of individual symbols.

Internally, it won't really be any faster. The individual symbols will still be treated the same way.
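A small sketch of why grouping would not change the work done at lookup time, assuming (hypothetically) that symbols end up in a per-symbol table. `expand_group` and the tuple layout are illustrative only and are not NVDA's real data structures:

```python
from __future__ import annotations


def expand_group(symbols: list[str], replacement: str, level: str) -> dict[str, tuple[str, str]]:
    """Expand one grouped definition into one table entry per symbol."""
    return {symbol: (replacement, level) for symbol in symbols}


# A grouped definition such as {a,b}(tab)A(tab)none ...
grouped = expand_group(["a", "b"], "A", "none")
# ... produces the same table as two separate lines.
separate = {"a": ("A", "none"), "b": ("A", "none")}

print(grouped == separate)  # prints True
```

The saving is in how many lines a translator writes, not in how the symbols are processed afterwards.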

To provide punctuation levels for each individual symbol in the braces, I'd like to propose:

{(A(tab)puncLevel),(b}(tab)A(tab)none

Even if we implement symbol grouping, I think symbol grouping with levels adds complexity (both to the code and for translators) with no advantage. If the level is defined separately, other parameters might be defined separately too. There's definitely no speed advantage here.

To summarise:

This is a fairly significant change that will have to be carefully implemented to avoid breaking user symbol files.
There is little to no speed advantage.

Given the above, the big question is: do you think this is hugely necessary or was it just a nice idea?

bhavyashah · 2017-08-14T04:46:04Z

@josephsl As the original author of this ticket, could you please respond to the questions asked in @jcsteh's #2676 (comment)?

Adriani90 · 2018-11-07T17:49:48Z

@josephsl, any updates regarding this issue? Does anyone work on it?

Adriani90 · 2024-03-03T17:38:39Z

Having this feature in symbols.dic would really make things simpler and would reduce the complexity of the file, especially for mathematical alphanumeric characters. If I implement these in symbols.dic (over 900 characters), there are multiple versions of the letters a, b, c (double-struck, script, etc.) and multiple versions of numbers such as subscript, superscript, etc. We don't need the full details of a character's name in symbols.dic, so for example all versions of the small letter a could be associated with the pronunciation "a", and so on.

Adriani90 · 2024-03-03T17:39:44Z

The full name of a character could then be retrieved with the help of an add-on from the Unicode databases etc. on demand (i.e. character information add-on which already exists).
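As a rough illustration of the two ideas above, Python's standard `unicodedata` module can both fold a styled character to its base letter and return its full Unicode name. This is only a sketch: using NFKC normalization for the base pronunciation is my assumption, not something NVDA or the character information add-on is known to do, and `base_pronunciation` and `full_name` are hypothetical helper names.

```python
import unicodedata


def base_pronunciation(char: str) -> str:
    """Fold a styled variant to its plain form via compatibility normalization."""
    return unicodedata.normalize("NFKC", char)


def full_name(char: str) -> str:
    """Look up the full Unicode name on demand."""
    return unicodedata.name(char, "unknown character")


# Double-struck small a, script small a, superscript two.
for char in ("\U0001D552", "\U0001D4B6", "\u00B2"):
    print(repr(base_pronunciation(char)), full_name(char))
```

With this approach the short pronunciation and the detailed name come from separate lookups, so symbols.dic would not need to carry the long names.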

nvaccessAuto added enhancement component/speech labels Nov 10, 2015

Adriani90 mentioned this issue May 10, 2020

symbols: support group references in replacements #11116

Merged
