Add Chinese words segmentation support. #4075

nvaccessAuto · 2014-04-15T10:16:18Z

Reported by vgjh2005 on 2014-04-15 10:16
Hi
Please add Chinese word segmentation support. Separating words in English is very simple, but Chinese is a very complex language: it is written without spaces between words, so it is difficult to separate them within a sentence, let alone a whole article. Correct segmentation helps readers understand what the article is expressing; incorrect segmentation can cause serious misreadings. In braille, the text would be split with a space after each word. Segmentation could also be used to control Numpad4 and Numpad6 movement in document browse mode and screen browse mode. At present, in Chinese, Numpad4 and Numpad6 behave the same as Numpad1 and Numpad3: they move by a single character.
There are two Python-based segmentation libraries. Please choose the first one if possible.
NLPIR2014
Jieba
Thanks a lot!!!

LeonarddeR · 2017-07-17T17:57:24Z

@vgjh2005: I assume you mean this specifically for NVDA's review cursor commands?

vgjh2005 · 2017-07-18T13:53:15Z

Hi:
Are you familiar with word segmentation? This technology could help us understand text in braille. In Chinese braille, words are separated by spaces. At the moment, it is too hard to read anything on a braille display. Word navigation would certainly be helpful too.
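
To illustrate the idea for readers who have not met it, here is a minimal sketch of word segmentation followed by space-joining for braille output. It uses greedy forward maximum matching against a tiny made-up vocabulary; this is a stand-in chosen for brevity, not the method used by NLPIR or Jieba, and nothing here is NVDA code. The names `segment`, `spaced_for_braille`, and `VOCAB` are hypothetical.

```python
def segment(text, vocabulary, max_len=4):
    """Greedy forward maximum matching.

    At each position, take the longest vocabulary entry that matches
    (up to max_len characters); fall back to a single character.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy vocabulary, for illustration only.
VOCAB = {"我们", "喜欢", "阅读", "盲文"}


def spaced_for_braille(text, vocabulary=VOCAB):
    """Insert a space between segmented words, as Chinese braille expects."""
    return " ".join(segment(text, vocabulary))


print(spaced_for_braille("我们喜欢阅读盲文"))  # 我们 喜欢 阅读 盲文
```

A real segmenter needs a large dictionary and some way to resolve ambiguous splits, which is why a dedicated library was requested rather than a simple lookup like this one.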

Adriani90 · 2020-04-25T14:09:35Z

@vgjh2005 how does this behave now in the latest NVDA alpha version?

Adriani90 · 2020-04-25T21:39:18Z

cc: @larry801, @dingpengyu could you please test whether this is still an issue in the latest alpha version of NVDA?

cary-rowen · 2022-02-12T08:11:40Z

Hello @Adriani90 @LeonarddeR @seanbudd

This issue is still active. In short, because Chinese word segmentation is not supported, Numpad4 / Numpad6 can only move by character instead of by word when Chinese users review Chinese content. So this excellent feature is almost ineffective for Chinese users.
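
As a rough sketch of what word-level movement could look like once a segmenter supplies word boundaries: the helpers below turn a list of already segmented words into start offsets and step forward or backward between them. The function names are hypothetical, the word list is assumed to come from some segmenter, and this does not reflect how NVDA's review cursor is actually implemented.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def word_starts(words):
    """Start offset of each word within the concatenated text."""
    ends = list(accumulate(len(w) for w in words))
    return [0] + ends[:-1]


def next_word(starts, offset):
    """Offset of the first word start after `offset`; stay put at the last word."""
    i = bisect_right(starts, offset)
    return starts[i] if i < len(starts) else offset


def previous_word(starts, offset):
    """Start of the current word if mid-word, otherwise the previous word's start."""
    i = bisect_left(starts, offset)
    return starts[i - 1] if i > 0 else 0


# Assumed output of a segmenter for the text "我们喜欢阅读盲文".
words = ["我们", "喜欢", "阅读", "盲文"]
starts = word_starts(words)      # [0, 2, 4, 6]
print(next_word(starts, 0))      # 2
print(previous_word(starts, 4))  # 2
```

Without segmentation, every character is effectively its own word, so the start offsets are 0, 1, 2, and so on, and moving by word is the same as moving by character, which matches the behavior described above.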

I'd love to keep an eye on this issue and if nvaccess or anyone else needs more information from me I'll be happy to help.

Thank you.

Adriani90 · 2024-03-25T13:47:59Z

Related to #16237.

Adriani90 · 2024-03-25T20:55:22Z

@cary-rowen I think you referenced #16237 in the wrong issue. I think though they are related. You have indeed proposed to include functionalities of @mltony's add-on, however, when we review such an issue on a core level we look at enhancements that can be done globally and your main problem is word handling which we don't have yet in NVDA core. @mltony's add-on is certainly a good point to start with and if that will be proposed as a PR, a wider part of the community will probably bring up new perspectives into it.

nvaccessAuto added the enhancement label Nov 10, 2015

nishimotz mentioned this issue Jun 28, 2016

"Speak typed words" behaves like "speak typed characters" when entering Asian text in Notepad and other edit fields and documents #2762

Open

nvaccessAuto mentioned this issue Aug 13, 2017

In notepad, only first character is announced when moving by word with multi-byte chars such as Asian words (specifically, Korean words) #2754

Open

feerrenrut added the feature/i18n Internationalization features label Apr 29, 2020
