Opened 7 months ago

Closed 7 months ago

#2756 closed defect (fixed)

NVDA does not read a pdf document properly

Reported by: aliminator
Owned by: jteh
Priority: minor
Milestone: 2012.3
Component: Browse mode
Version:
Keywords: regression
Cc:
Operating system:
Blocked by:
Blocking:

Description

NVDA does not read this document properly. It seems that the first two letters of every word are doubled and repeated.
Please have a look at this document located at
http://www.studentenwerk-giessen.de/docs/HSG/Speisepl%E4ne/THM-GI.pdf

You don't need to understand the contents.
But please compare the output of NVDA 2012.2.1 and 2012.3b3.
The former version works properly with the document and reads the words correctly.
Adobe Reader X 10.1.4 was used.

Change History (9)

comment:1 Changed 7 months ago by aliminator

I forgot to mention that the issue occurs where tables are displayed.
However, it could not be reproduced using another document that contains a table.

comment:2 Changed 7 months ago by briang1

Yes, a little way down, a line starts with the word pizza, and the second word according to the beta is...
DiDio
However, on the current release we get what it should be...
Diavolo

I have never seen this on English PDFs, with or without tables, though there is still an anomaly where spaces between words will be removed if a graphic is in front of the text, or appears to be according to NVDA.

comment:3 Changed 7 months ago by jteh

  • Component changed from Official builds to Browse mode
  • Keywords regression added

We did change the way text is retrieved to work around a bug in Adobe Reader which prevented us from retrieving any formatting. However, I don't see doubled characters with Adobe Reader XI (which is the latest version). Perhaps there's a fix in Reader XI which affects the way we're doing this now. Please test with Reader XI and report.

comment:4 Changed 7 months ago by jteh

changeset:00d0854d033ee61333d908c4c50e4ba93ca03c17 may help, though I still think this is probably a Reader bug.

comment:5 Changed 7 months ago by briang1

Well, that file does work as far as reading is concerned in Adobe Reader 11 on XP, but navigating by word with control+cursor keys does not seem to; it gets only part of each word. Once again, though, it's OK in a normal type of PDF, of course.
Hope that helps.

comment:6 Changed 7 months ago by briang1

Yes, the fix just committed seems to have fixed it in Reader X.
The strange reading of words seems to be still there, but may well be something to do with the format of the table used. Not sure, but if there is a comma on the end of the word, it works.

Incidentally, if anyone has a way to report bugs to Adobe, maybe they could fix the alert about removing protected mode that appears at the start. At present it says that the checkbox for this, which helps XP users see the content, is in the General section of the preferences. It has in fact been moved to Enhanced Security in Adobe Reader 11, but the text was obviously not changed before release.

comment:7 follow-up: Changed 7 months ago by aliminator

Hmm, I tested this issue in Adobe Reader XI; the defect seen in X does not occur there. It seems to be fixed, although in the table (especially in the first row) the last letter of each word is being read separately as if it is not one word. In braille, it is displayed correctly. But I think this is not a new issue. Is there a ticket already open for it?

comment:8 Changed 7 months ago by briang1

Just to make sure we are all talking about the same thing: my comment above was me testing the latest snapshot with the fix in it with Reader X, and the comments about the control+cursor reading apply to both X and 11. I assume this is what you mean above. I noted that it read the words with the commas, but often missed the last character off other words. It's a strange file, this one. It would be interesting to know what produced it, as I had some tabular info in a PDF in English and this was not happening there.

comment:9 in reply to: ↑ 7 Changed 7 months ago by jteh

  • Milestone set to 2012.3
  • Resolution set to fixed
  • Status changed from new to closed

Replying to aliminator:

It seems to be fixed, although in the table (especially in the first row) the last letter of each word is being read separately as if it is not one word.

I think this is a problem with the PDF. Those letters are split into separate nodes for some reason and formatting info can't be retrieved for them either. This is a fairly inaccessible PDF; no tagging for a start.

Marking as fixed as per comment:8.
