Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rare watchdog and force quit/restart brokenness #5189

Closed
nvaccessAuto opened this issue Jul 1, 2015 · 3 comments
Closed

Rare watchdog and force quit/restart brokenness #5189

nvaccessAuto opened this issue Jul 1, 2015 · 3 comments
Assignees
Milestone

Comments

@nvaccessAuto
Copy link

Reported by jteh on 2015-07-01 01:32
This ticket covers several issues I discovered as a result of the problem reported in this thread.

Deadlock in Watchdog termination

When watchdog terminates, there is a rare chance that a deadlock could occur. This will happen if:

  1. watchdog.alive is called.
  2. The timeout elapses and the watcher wakes.
  3. Either watchdog.alive isn't called again or the watcher doesn't detect it before the next step.
  4. watchdog.terminate is called, thus signaling the timer.
  5. Meanwhile, the watcher is still waiting for the core to be alive and eventually attempting recovery.

I wasn't actually able to reproduce this in the way reported, but I'm pretty sure this is the issue. It is very much sensitive to timing.

The solution is to make watchdog._isAlive always return True if the watchdog has been terminated.

Inability to force quit/restart if NVDA freezes during exit

If watchdog freezes during termination as described above, trying to force quit or restart NVDA (e.g. by pressing control+alt+n) doesn't work. The problem is that we destroy the window that NVDA uses to detect a previous copy of NVDA before terminating watchdog. So, when watchdog termination freezes, the new copy of NVDA can't find the previous copy to kill it. This could also happen if anything else freezes during termination after the window is destroyed.

The solution is to destroy the window as late as possible.

Recovery attempts after core goes to sleep

The following could happen:

  1. watchdog.alive is called.
  2. The timeout elapses, so the core is treated as dead.
  3. The watcher wakes, detecting core death.
  4. watchdog.asleep is called.

In this case, the watcher won't stop recovery attempts until watchdog.alive is called again, even though watchdog.asleep was called (indicating that the core is now asleep, not dead).

I haven't been able to reproduce this in practice, but I'm pretty sure it's possible.

The solution is to call watchdog.alive at the top of watchdog.asleep to "reset" the watchdog. Note that CancelWaitableTimer does not reset the timer to unsignaled if it was signaled, which is the case here. Calling watchdog.alive will do this because it calls SetWaitableTimer.

@nvaccessAuto
Copy link
Author

Comment 1 by James Teh <jamie@... on 2015-07-01 01:48
In [0bc9800]:

Merge branch 't5189' into next

Incubates #5189.

Changes:
Added labels: incubating

@nvaccessAuto
Copy link
Author

Comment 2 by jteh on 2015-07-01 01:50
Ronan, please try the "next" snapshot for 1 July (which will be available in about 8 hours) and report whether this fixes the issue you reported. A temporary/portable copy is fine. Thanks.

@nvaccessAuto
Copy link
Author

Comment 3 by James Teh <jamie@... on 2015-07-15 07:46
In [8e912ec]:

Fix a rare deadlock in watchdog termination, inability to force quit/restart NVDA if it freezes during exit and another rare watchdog edge case.

See the code comments and/or the ticket for full details.
Fixes #5189.

Changes:
Removed labels: incubating
State: closed

@nvaccessAuto nvaccessAuto added this to the 2015.3 milestone Nov 10, 2015
jcsteh added a commit that referenced this issue Nov 23, 2015
…restart NVDA if it freezes during exit and another rare watchdog edge case.

See the code comments and/or the ticket for full details.
Fixes #5189.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants