option to read capitalized letters individually, as words or let synthesizer handle it #4520

Closed
nvaccessAuto opened this issue Oct 3, 2014 · 15 comments

Comments

@nvaccessAuto

Reported by blindbhavya on 2014-10-03 08:20
I propose a combo box in NVDA to read capitalized letters with the following options
individual = This option instructs the synthesizer to read capitalized letters individually, e.g. NVDA will be read as n v d a
words = This option instructs the synthesizer to read capitalized letters as words, e.g. UNESCO will be read as unesco
let synthesizer handle it = This option allows the synthesizer to handle the reading of capitalized letters (which is what happens currently).
Hope this may be implemented.

@nvaccessAuto
Author

Comment 1 by jteh on 2014-10-03 09:33
The problem is that one generally encounters both acronyms (UNESCO) and simple abbreviations (NVDA). Doing as you propose suggests the user would have to keep changing this depending on the content they were reading. Furthermore, most synthesisers already handle this, since they have knowledge of how the text is spoken. eSpeak, for example, pronounces both cases as expected.

As for the UNESCO example, how many people would really want this read as individual letters? This seems like a pretty rare use case. If it really is rare, I'd suggest this is advanced usage and can be easily achieved using regular expressions.

@nvaccessAuto
Author

Comment 2 by blindbhavya on 2014-10-03 09:40
Hi.
1 I agree that ESpeak handles most cases correctly, but not all synths do so.
2 However, in cases where synths read capitalized words as words, the word is unintelligible. I don't mind synths reading any capitalized words as individual characters, so personally if such a feature would be implemented, I would configure the option as speak capitalized words as individual characters.
Hope I have expressed my views clearly.

@nvaccessAuto
Author

Comment 3 by blindbhavya on 2014-10-03 09:41
Also, in terms of regular expressions, it is my belief that most average users (including me) wouldn't have much knowledge about them and wouldn't be aware of their power.

@nvaccessAuto
Author

Comment 4 by Sukil on 2014-10-13 16:27
One suggestion though, speaking about capitalized letters. It may be good for proofreading to detect capital letters not only when reading character by character, but when reading line by line, and act accordingly (raise of pitch or beeps).

@nvaccessAuto
Author

Comment 5 by jteh (in reply to comment 4) on 2014-10-13 23:04
Replying to Sukil:

It may be good for proofreading to detect capital letters not only when reading character by character, but when reading line by line, and act accordingly (raise of pitch or beeps).

This is covered by #3286.

@nvaccessAuto
Author

Comment 6 by blindbhavya (in reply to comment 1) on 2014-11-09 12:57
Hi,
Though I have briefly answered your questions before, I am replying with some additional experience, because I have been dealing with capitalized letters a lot in the past two days.
Replying to jteh:

The problem is that one generally encounters both acronyms (UNESCO) and simple abbreviations (NVDA). Doing as you propose suggests the user would have to keep changing this depending on the content they were reading.

When capitalized letters are read as words, in most cases the word is pretty unintelligible. For example, consider the following mathematical statement: Triangle OPQ and XYZ are two congruent triangles
In this case, when you are reading line by line, you won't use anything, but read word by word and see how ESpeak pronounces them. I recently dealt with lots of Math consisting of capitalized letters (FYI I was dealing with different geometry related subjects such as Congruence and Properties Of Parallel Lines, where lots of angles, line segments, and geometrical shapes had capitalized names, which were difficult to read word by word).

Furthermore, most synthesisers already handle this, since they have knowledge of how the text is spoken. eSpeak, for example, pronounces both cases as expected.
ESpeak pronounces both cases as expected, okay, but that was only while reading line by line; try word by word. Also, till now I have had experience with only ESpeak and Eloquence (Eloquence for JAWS), where ESpeak failed word by word, whereas Eloquence failed in many more situations. Having seen Eloquence, I feel that other synthesizers could also fail like Eloquence, and anyway, ESpeak is not perfect either.

As for the UNESCO example, how many people would really want this read as individual letters?

I believe that there are more simple abbreviations as compared to acronyms, therefore, generally, one would want to get capitalized text spoken as individual letters. However, a person would generally only switch to read capitalized text as words only if they have to read something written in capital letters.
The two statements containing 'generally' are only my views, about how a user (like me) would use it, I don't say that I am right.

This seems like a pretty rare use case. If it really is rare, I'd suggest this is advanced usage and can be easily achieved using regular expressions.
Use Cases
1 While dealing with Maths one would switch to read individually.
2 For reading text that is written in capitals one would use read as words.
3 There is no existing synth that handles all this correctly, I totally understand that this is inevitable.
4 For proof reading one would switch to read as individual letters.
5 For general usage, I believe one would configure read as individual letters as the default setting, because generally there are more simple abbreviations as compared to acronyms (my belief only).
Anyway, that was just a detailed reply. Could someone tell me how this could be temporarily achieved through regular expressions?
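
To illustrate the regular-expression approach asked about above, here is a minimal sketch in plain Python. It is not an NVDA speech dictionary entry, and the names `ALL_CAPS_WORD` and `spell_out_caps` are made up for this example. It assumes an "all-caps word" means two or more consecutive ASCII capital letters standing alone as a word.

```python
import re

# Assumption: a "capitalized word" is a whole word made only of
# two or more ASCII capital letters, e.g. "OPQ" or "NVDA".
ALL_CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")


def spell_out_caps(text: str) -> str:
    """Put a space between the letters of every all-caps word."""
    return ALL_CAPS_WORD.sub(lambda m: " ".join(m.group()), text)


print(spell_out_caps("Triangle OPQ and XYZ are two congruent triangles"))
# prints: Triangle O P Q and X Y Z are two congruent triangles
```

Note that a blanket rule like this also turns UNESCO into "U N E S C O", which is the trade-off raised in comment 1.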

@nvaccessAuto
Author

Comment 7 by Santoso on 2015-07-02 00:55
Hi,

I think this enhancement will be very useful as I also find it difficult to understand the reading when capitalized words are all spelled, like in articles or websites that (God knows why) use a lot of caps words.

Thanks

R

@bhavyashah

A friendly ping in case any developer wants to give this feature request a try or @feerrenrut would like to assign this a priority.

@feerrenrut
Contributor

Personally I'm not convinced that this is something that should be solved within NVDA. It would be much better if the synth was responsible for this; I think bugs should be filed against synths that don't get this right. That said, if the same synth produces better results with other screen readers, then perhaps there is more that we can do.

@vimalan-sakthivel

vimalan-sakthivel commented Aug 22, 2017

NVDA (version 2017.2) reads "NEW" as N.E.W.

@feerrenrut
Contributor

@vimalan-sakthivel which synthesizer were you using? Espeak? Does this happen with other synthesizers?

When I test "NEW" with NVDA Version: master-14328,c6379eb3 (close to the upcoming release of NVDA 2017.3) using espeak I get the described behaviour. However when I use the Microsoft Speech API version 5 synthesizer "NEW" is spoken as "new".

@vimalan-sakthivel could you please create a new issue describing this on the espeak-ng repository: https://github.com/espeak-ng/espeak-ng/issues

Closing this issue since this seems like synth specific behaviour.

@bhavyashah

@feerrenrut I would like to kindly request reopening this issue. This ticket does not deal with synth specific pronunciation bugs but with a feature request for a combo box to toggle whether capitalized words should be read individually, e.g. N V D A, as complete words, e.g. nvda, or to allow the synthesizer to handle it via its own pronunciation rules.

@feerrenrut
Contributor

You're right, I didn't properly consider the initial request. But I don't think it really changes the outcome. To start with, I would categorise this as a new feature rather than a bug. As a feature, it's not well defined enough, in terms of the problem it's trying to solve, and the value it would bring to users.

I also think it's important for this issue to distinguish between a (pronounceable) acronym and an (unpronounceable) initialism. See https://www.dailywritingtips.com/acronym-vs-initialism/

I expect the process of trying to define this better (through triage) to be prohibitively expensive. So rather than leaving it open, I still think it's best to close the issue.
If you wish to persevere with this issue, I suggest first defining a set of concrete use cases, and trying to define general user stories for this. Also consider the situations that will be problematic (e.g. math in lower case, documents or titles in upper case).

Even so it's not clear that we could come up with an implementation that would improve the situation from within NVDA.

Given the following example concrete use-cases
Acronym, pronounce by word:

  • UNESCO

Initialism, pronounce by letter:

  • NVDA
  • XYZ (in the context of math)
  • OPQ (in the context of math)

leads me to the following questions / thoughts:

  • What about (in the context of math) xyz (in lower case)?
  • Applying a rule based only on the upper/lower case-ness of the word will result in lots of edge cases that are not addressed (or made worse).
  • Having an option to swap between two outcomes (letter pronunciation / word pronunciation) complicates the usage of NVDA. Neither are likely to result in problem free behaviour in any single use case. This adds the complication of user configurable behaviour, without providing a concrete solution.
  • Adding a consideration for the length of the word to the rule would mitigate the impact of some errors (making them less inconvenient). For instance, it would take less time to incorrectly read out the individual letters of STOP than those of STATIONARY. There are still many edge cases, since long initialisms likely exist.
  • It seems like the key factor on whether or not something should be said as a word, or as a set of letters, is based on whether it is considered pronounceable. This is still a fuzzy concept, the combination "xyz" could be considered pronounceable. In terms of knowing whether something is pronounceable, the synthesizer is in the best position to have the information to decide on that.

I still think that the best place to push for a fix to this is with the synthesizer. Unfortunately that means that it likely needs to be fixed by several synthesizers. I would suggest that the synthesizer pronounce letters in any cases where there is some doubt that the word is pronounceable, for instance "xyz". I would add an exception to this: if the word is longer than a certain limit (maybe 8 characters), then pronounce it as a word, because a really long initialism is tedious to hear letter by letter, and when encountering nonsense it's better for it to be over sooner. The user can always step through character by character to try to make sense of it.
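
The rule suggested above can be sketched roughly as follows. This is only an illustration, not how any synthesizer actually decides. The pronounceability test (at least one vowel and no run of three or more consonants) is an assumption invented for this sketch, as are the names `looks_pronounceable` and `choose_reading`. Only the length limit of 8 comes from the comment.

```python
VOWELS = set("AEIOU")


def looks_pronounceable(word: str) -> bool:
    """Crude stand-in (assumption): a vowel, and no run of 3+ consonants."""
    run = 0
    has_vowel = False
    for ch in word.upper():
        if ch in VOWELS:
            has_vowel = True
            run = 0
        else:
            run += 1
            if run >= 3:
                return False
    return has_vowel


def choose_reading(word: str, max_letters: int = 8) -> str:
    """Return "word" or "letters" for a single token."""
    # Exception from the comment: very long tokens are read as words,
    # so that nonsense is over sooner.
    if len(word) > max_letters:
        return "word"
    # Otherwise spell the token out whenever pronounceability is in doubt.
    return "word" if looks_pronounceable(word) else "letters"


for token in ["UNESCO", "NVDA", "xyz", "STATIONARY", "OPQ"]:
    print(token, "->", choose_reading(token))
```

With this crude test, UNESCO and STATIONARY are read as words while NVDA and xyz are spelled out. OPQ still comes out as a word, which is the kind of edge case described in the list above.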

@bhavyashah

The reasoning you provide is very cogent. Thus, I guess the current status (closed) of this ticket is probably justified. It was interesting however to understand how any feature needs to be thoroughly evaluated from all angles, use cases, etc. Thanks @feerrenrut.

@vimalan-sakthivel

@feerrenrut
You're correct. I used eSpeak NG, and as you mentioned, this issue was not observed with Microsoft Speech API version 5. I have raised a new issue with eSpeak NG. Thank you!
espeak-ng/espeak-ng#301
