409965 – [visualizer] Allow all-stop mode for multicore visualizer

Status:	RESOLVED FIXED

Alias:	None

Product:	CDT
Classification:	Tools
Component:	cdt-debug-dsf-gdb
Version:	Next
Hardware:	PC Linux

Importance:	P3 normal
Target Milestone:	8.3.0
Assignee:	Marc Dumais
QA Contact:	Marc Khouzam

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2013-06-05 10:27 EDT by Marc-André Laperle
Modified:	2014-01-15 09:22 EST
CC List:	4 users

See Also:

Attachments
Project and binary (136.10 KB, application/gzip) 2013-06-05 10:32 EDT, Marc-André Laperle
The normal visualizer display, for a "non-stop" debug session (44.18 KB, image/png) 2013-06-26 13:37 EDT, Marc Dumais
The proposed display, for an "all-stop" debug session (42.71 KB, image/png) 2013-06-26 13:37 EDT, Marc Dumais

Description Marc-André Laperle

2013-06-05 10:27:27 EDT

C/C++ Development Tools 8.2.0.201306051012
C/C++ Multicore Visualizer 1.2.0.201306051012
GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu

I'm using GDB in all-stop mode and I have the Visualizer opened. When I start debugging, I get this in the error log every time:

Request for monitor: 'RequestMonitor (org.eclipse.cdt.dsf.gdb.multicorevisualizer.internal.ui.view.MulticoreVisualizerEventListener$2@222b60f): Status ERROR: org.eclipse.cdt.dsf code=10001 Target not available. null' resulted in an error.

Comment 1 Marc-André Laperle

2013-06-05 10:32:48 EDT

Created attachment 231993
Project and binary

The CDT project was created on Linux (Ubuntu 13.04) and requires gtk installed. The binary is 64 bit x86.

Comment 2 Marc Khouzam

2013-06-05 10:34:04 EDT

The Multicore Visualizer is not meant to be used in all-stop mode.

That being said, we should handle the case gracefully.  Maybe we should put a message in the visualizer view saying that "All-stop mode is not supported"

Comment 3 Marc-André Laperle

2013-06-05 10:55:05 EDT

(In reply to comment #2)
> The Multicore Visualizer is not meant to be used in all-stop mode.

I'm using all-stop because I get a SIGSEGV in the inferior when I use non-stop. Is it because the visualizer code doesn't wait until the inferior is stopped before sending MI commands, or because of a GDB limitation?

(In reply to comment #2)
> That being said, we should handle the case gracefully.  Maybe we should put
> a message in the visualizer view saying that "All-stop mode is not supported"

Sounds good to me.

Comment 4 Marc Dumais

2013-06-05 13:41:14 EDT

I see this error too, in the .metadata/.log file on my workspace.

One scenario that triggers the error is the creation of a new thread in the debugged program, when we attempt to send the "thread-info" command to GDB.

Comment 5 Marc Khouzam

2013-06-05 13:59:49 EDT

(In reply to comment #3)
> (In reply to comment #2)
> > The Multicore Visualizer is not meant to be used in all-stop mode.
> 
> I'm using all-stop because I get a SIGSEGV in the inferior when I use
> non-stop. 

Why is that?

> Is it because the visualizer code doesn't wait until the inferior
> is stopped before sending mi? or because of a GDB limitation?

So non-stop uses what GDB calls target-async, which keeps GDB responsive even when the inferior is running.  This is also available in all-stop mode but CDT does not use it.  The effort required to migrate to target-async has not been justified (yet?).
Because CDT uses all-stop in a way that prevents communication while the inferior is running, it makes the visualizer much less useful: it would only display things when the inferior is stopped.

To be honest, I haven't looked into that case at all.  Since /// focuses on non-stop, I never felt it was right to spend much time on that.

(In reply to comment #4)
> I see this error too, in the .metadata/.log file on my workspace.
> 
> One scenario that triggers the error is the creation of a new thread in the
> debugged program, when we attempt to send the "thread-info" command to GDB.

Probably because the visualizer requests information when the program is running, and GDB is not available (all-stop mode).

Comment 6 Marc Dumais

2013-06-05 14:31:17 EDT

(In reply to comment #5)
> Probably because the visualizer requests information when the program is
> running, and GDB is not available (all-stop mode).

I see the error even when stepping into the program.  In that case, the "thread-info" command for the new thread triggers the error, but then a second occurrence of that same command (from a different caller, not the multicore visualizer) works.  I think maybe the visualizer asks for the thread info before the step has finished executing, while the other request is made after?

Here is the stack trace for both cases: 

Thread [org.eclipse.cdt.dsf.gdb - 3] (Suspended (breakpoint at line 321 in CommandCache))	
	CommandCache.execute(ICommand<V>, DataRequestMonitor<V>) line: 321	
	GDBProcesses_7_2_1(GDBProcesses_7_1).getExecutionData(IThreadDMContext, DataRequestMonitor<IThreadDMData>) line: 171	
	MulticoreVisualizerEventListener.handleEvent(IRunControl$IStartedDMEvent) line: 178	
	GeneratedMethodAccessor56.invoke(Object, Object[]) line: not available	
	DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25	
	Method.invoke(Object, Object...) line: 597	
	DsfSession.doDispatchEvent(Object, Dictionary<?,?>) line: 519	
	DsfSession.access$2(DsfSession, Object, Dictionary) line: 463	
	DsfSession$3.run() line: 390	
	DefaultDsfExecutor$TracingWrapperRunnable.run() line: 374	
	Executors$RunnableAdapter<T>.call() line: 439	
	FutureTask$Sync.innerRun() line: 303	
	ScheduledThreadPoolExecutor$ScheduledFutureTask<V>(FutureTask<V>).run() line: 138	
	ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.access$301(ScheduledThreadPoolExecutor$ScheduledFutureTask) line: 98	
	ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.run() line: 206	
	ThreadPoolExecutor$Worker.runTask(Runnable) line: 886	
	ThreadPoolExecutor$Worker.run() line: 908	
	Thread.run() line: 662	



Thread [org.eclipse.cdt.dsf.gdb - 3] (Suspended (breakpoint at line 321 in CommandCache))	
	CommandCache.execute(ICommand<V>, DataRequestMonitor<V>) line: 321	
	GDBProcesses_7_2_1(GDBProcesses_7_1).getExecutionData(IThreadDMContext, DataRequestMonitor<IThreadDMData>) line: 171	
	ThreadVMNode.updatePropertiesInSessionThread(IPropertiesUpdate[]) line: 320	
	AbstractThreadVMNode$3.run() line: 170	
	DefaultDsfExecutor$TracingWrapperRunnable.run() line: 374	
	Executors$RunnableAdapter<T>.call() line: 439	
	FutureTask$Sync.innerRun() line: 303	
	ScheduledThreadPoolExecutor$ScheduledFutureTask<V>(FutureTask<V>).run() line: 138	
	ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.access$301(ScheduledThreadPoolExecutor$ScheduledFutureTask) line: 98	
	ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.run() line: 206	
	ThreadPoolExecutor$Worker.runTask(Runnable) line: 886	
	ThreadPoolExecutor$Worker.run() line: 908	
	Thread.run() line: 662

Comment 7 Marc Dumais

2013-06-06 07:19:32 EDT

(In reply to comment #2)
> The Multicore Visualizer is not meant to be used in all-stop mode.
> 
> That being said, we should handle the case gracefully.  Maybe we should put
> a message in the visualizer view saying that "All-stop mode is not supported"

One comment about this: It's true that not all debug info is available in all-stop mode, but the visualizer can still be used.  For instance, the load monitoring works in all-stop mode.

Comment 8 Marc Khouzam

2013-06-06 13:43:44 EDT

(In reply to comment #7)

> One comment about this: It's true that not all debug info is available in
> all-stop mode, but the visualizer can still be used.  For instance, the load
> monitoring works in all-stop mode.

Currently for local debug sessions, we go straight to Linux (not using GDB), so you are right that it works.  However, for remote sessions we use GDB, and in the future, we plan on using GDB all the time, so we'll run into the same problem.

Comment 9 Marc Dumais

2013-06-07 07:11:58 EDT

(In reply to comment #8)
> Currently for local debug sessions, we go straight to Linux (not using GDB),
> so you are right that it works.  However, for remote sessions we use GDB,
> and in the future, we plan on using GDB all the time, so we'll run into the
> same problem.

Yes, you're right.  So then it seems we have a limitation with the way we use the debugger that makes it difficult to consistently provide the required information to the visualizer, in the all-stop mode.  

I'll look into making a patch to implement something along the lines of the suggestion in comment #2.

Comment 10 Marc Dumais

2013-06-26 13:36:00 EDT

To recap:  the error message that Marc-Andre reported in this bug happens when debugging in "all-stop" mode, with the visualizer open.  The error happens whenever a new thread is spawned.

Discussing the "all stop" mode, it became clear that the multicore visualizer does not support that mode very gracefully, due to limitations in the DSF-GDB services and/or GDB itself. 

It was proposed that, in "all-stop" mode, the visualizer should not try to display its model, but instead display an empty canvas with an explanation message.

I have a patch that I will shortly submit that attempts to address this.

Here are the highlights: 
- a validity flag was added to the model.  If the model is set as invalid, a string can be provided to explain why.
- a check was added in getCPUs() to check if the current debug session is of the type "all-stop".  If it is, the model is marked as invalid and its construction is stopped.
- in the canvas, a check of the model validity is made before drawing the CPUs, cores and threads.  If the model is marked as invalid, the reason is displayed in the canvas's status bar area.
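
The three points above can be illustrated with a minimal sketch. It is not the patch itself: the class names SketchModel and SketchCanvas, the allStop parameter, and the rendered strings are all hypothetical stand-ins chosen for illustration, and the real model and canvas classes in CDT are considerably richer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the visualizer model; not the CDT class. */
class SketchModel {
    private final List<String> cpus = new ArrayList<>();
    private boolean valid = true;
    private String invalidReason = "";

    void addCpu(String name) { cpus.add(name); }
    List<String> getCpus() { return cpus; }

    /** Marks the model as invalid and records a human-readable reason. */
    void markInvalid(String reason) {
        valid = false;
        invalidReason = reason;
    }

    boolean isValid() { return valid; }
    String getInvalidReason() { return invalidReason; }
}

/** Hypothetical model builder and canvas, reduced to two static helpers. */
class SketchCanvas {
    /** Stops building the model as soon as an all-stop session is detected. */
    static SketchModel buildModel(boolean allStop, List<String> cpuNames) {
        SketchModel model = new SketchModel();
        if (allStop) {
            model.markInvalid("All-stop mode is not supported");
            return model;
        }
        for (String name : cpuNames) {
            model.addCpu(name);
        }
        return model;
    }

    /** Checks validity before drawing; an invalid model only yields its reason. */
    static String render(SketchModel model) {
        if (!model.isValid()) {
            return "[status] " + model.getInvalidReason();
        }
        return "CPUs: " + String.join(", ", model.getCpus());
    }
}
```

Keeping the reason string on the model, rather than in the canvas, means any other consumer of the model could report why it is empty.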

Comment 11 Marc Dumais

2013-06-26 13:37:12 EDT

Created attachment 232808
The normal visualizer display, for a "non-stop" debug session

Comment 12 Marc Dumais

2013-06-26 13:37:43 EDT

Created attachment 232809
The proposed display, for an "all-stop" debug session

Comment 13 William Swanson

2013-06-26 14:13:00 EDT

Question: rather than disable the visualizer outright in all-stop mode,
is it possible to simply notice that the target's GDB is running under
all-stop mode and just not try to gather information until the next time
it stops?

True, you would only see the visualizer's display updating when you stop
or step, but this is likely to be "good enough", since that's when you'd
want to look at the current state of the world in detail anyway.
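
One way to picture this suggestion is a small gate that remembers a refresh was requested while the target was running and performs it at the next stop. This is only a sketch under assumptions: DeferredRefreshSketch and its method names are hypothetical, and the refresh itself is reduced to a counter rather than any real query to GDB.

```java
/** Hypothetical gate that postpones refreshes until the target is stopped. */
class DeferredRefreshSketch {
    private boolean targetRunning;
    private boolean refreshPending;
    private int refreshCount;

    /** Called when the target resumes; queries are unsafe from here on. */
    void onResumed() {
        targetRunning = true;
    }

    /** Refreshes immediately if stopped, otherwise remembers the request. */
    void requestRefresh() {
        if (targetRunning) {
            refreshPending = true;
            return;
        }
        doRefresh();
    }

    /** Called when the target stops; runs any refresh that was postponed. */
    void onSuspended() {
        targetRunning = false;
        if (refreshPending) {
            refreshPending = false;
            doRefresh();
        }
    }

    /** Placeholder for the real work of querying the debugger. */
    private void doRefresh() {
        refreshCount++;
    }

    int getRefreshCount() { return refreshCount; }
    boolean isRefreshPending() { return refreshPending; }
}
```

Several requests made while running collapse into a single refresh at the next stop, which matches the idea that the display only needs to be accurate once execution has halted.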

I ask because we tend to use all-stop-like debugging here (not GDB's
all-stop mode but a custom solution since we use multiple GDB's),
and this is in part because our experience was that all-stop seems
to be generally easier for users to grok and use. Hence I have a minor
interest/bias in support of all-stop mode. :-)

As a side note, we also generally assume that while things are running,
the Grid View only displays a "rough" version of what's going on, based
on what information it receives. It's only when you're stopped at a breakpoint
that we can provide a completely accurate picture. So there's precedent
for having the "running" state be a less-precise view of things.


(From comment #5)
> 
> So non-stop uses what GDB calls target-async which allows to keep GDB
> responsive, even when the inferior is running.  This is also available in
> all-stop mode but CDT does not use it.  The effort required to migrate to
> target-async has not been justified (yet?).
> Because CDT uses all-stop in a way that prevents communication while the
> inferior is running, it make the visualizer much less useful: it would only
> display things when the inferior is stopped.
> > 
> > One scenario that triggers the error is the creation of a new thread in the
> > debugged program, when we attempt to send the "thread-info" command to GDB.
> 
> Probably because the visualizer requests information when the program is
> running, and GDB is not available (all-stop mode).

(From comment #8)
> Currently for local debug sessions, we go straight to Linux (not using GDB),
> so you are right that it works.  However, for remote sessions we use GDB,
> and in the future, we plan on using GDB all the time, so we'll run into the
> same problem.

Comment 14 Marc Dumais

2013-06-27 08:21:34 EDT

Hi Bill,

Thanks for the comments.  I have talked with Marc K. and we have a few ideas on how we could make the visualizer react better in the all-stop mode. So we'll give this a try and put the current patch on ice.

Regards,

Marc

Comment 15 Marc Dumais

2013-06-27 13:48:41 EDT

Hi,

Taking Bill's suggestions into consideration, I have a new patch that will soon be ready for review.

This time, instead of disabling the visualizer's display in the "all-stop" mode, I try to make it work a bit better in that mode. As noted, it will never be perfect, since GDB can't answer while it's running.

The main changes are in MulticoreVisualizerEventListener.java, and apply only when in the "all-stop" mode:

- IStartedDMEvent event, received when a thread is spawned: since we can't ask GDB for the core and OS thread id for the newly spawned thread, I still add the thread to the model but with bogus core and thread id info (both set to zero). I figure it's better to have a placeholder thread in the visualizer than nothing.

- ISuspendedDMEvent event, received when a thread has stopped: in this scenario, I go ahead and have the visualizer recreate its model. Since the execution is stopped, we should be able to get the information about all the threads from GDB. So in effect, this re-synchronizes the visualizer display with the current GDB state, for that session.

There are remaining scenarios where the "all-stop" mode is still glitchy. For instance, with the program running, if the user de-selects and re-selects the debug session in the debug view. This triggers the re-creation of the model, and since GDB can't answer, we get no thread info at all. Another case is for remote sessions, when the user unselects the "stop on startup at... main" option from the debug configuration. In that case, it's not possible at startup to get the CPU/cores info and so nothing much will be displayed.

In both cases, the visualizer will again show the correct info upon the execution stopping.
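
The two event-handling changes described in this comment can be sketched roughly as follows. The names SketchThread, ThreadInfoSource and AllStopListenerSketch are hypothetical and do not come from the patch; ThreadInfoSource stands in for whatever service asks GDB for thread details. As a simplification, the sketch refreshes existing entries in place on suspend, whereas the comment describes recreating the whole model.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical thread record; not the CDT VisualizerThread class. */
class SketchThread {
    final int gdbThreadId;
    int osThreadId;
    int coreId;

    SketchThread(int gdbThreadId, int osThreadId, int coreId) {
        this.gdbThreadId = gdbThreadId;
        this.osThreadId = osThreadId;
        this.coreId = coreId;
    }
}

/** Hypothetical stand-in for asking GDB about a thread. */
interface ThreadInfoSource {
    /** Returns {osThreadId, coreId} for the given GDB thread id. */
    int[] lookup(int gdbThreadId);
}

/** Hypothetical listener showing the all-stop handling of the two events. */
class AllStopListenerSketch {
    final Map<Integer, SketchThread> model = new LinkedHashMap<>();
    private final ThreadInfoSource source;

    AllStopListenerSketch(ThreadInfoSource source) {
        this.source = source;
    }

    /** Thread spawned while running: GDB cannot answer, so store placeholders. */
    void onThreadStarted(int gdbThreadId) {
        model.put(gdbThreadId, new SketchThread(gdbThreadId, 0, 0));
    }

    /** Execution stopped: GDB can answer again, so re-synchronize every thread. */
    void onSuspended() {
        for (SketchThread thread : model.values()) {
            int[] info = source.lookup(thread.gdbThreadId);
            thread.osThreadId = info[0];
            thread.coreId = info[1];
        }
    }
}
```

The placeholder entry keeps the thread visible between the start event and the next stop, at the cost of showing core 0 and OS thread id 0 until the data can be fetched.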

Comment 16 Marc Dumais

2013-06-27 14:42:39 EDT

Patch ready for review:

https://git.eclipse.org/r/#/c/14118/1

Comment 17 Marc-André Laperle

2013-07-10 00:14:30 EDT

(In reply to comment #5)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > The Multicore Visualizer is not meant to be used in all-stop mode.
> > 
> > I'm using all-stop because I get a SIGSEGV in the inferior when I use
> > non-stop. 
> 
> Why is that?

I'm debugging native code that Eclipse is using (swt, gtk, etc). There is a case where the JVM actually triggers SIGSEGV and has a signal handler that handles it gracefully. This seems to be by design:

JVM_handle_linux_signal()
// stuff
    if (sig == SIGSEGV) {
      // other stuff
      return true;
    }

It actually happens in all-stop too, but it's not as noticeable since everything is stopped, so the JVM doesn't send the signal in the background while I debug another thread.

If I set gdb to not handle this signal, then all is well:
(gdb) handle SIGSEGV nostop noprint

Comment 18 Marc-André Laperle

2013-07-10 12:56:58 EDT

(In reply to comment #16)
> Patch ready for review:
> 
> https://git.eclipse.org/r/#/c/14118/1

I was testing your fix in all-stop mode and got an NPE. It might not be specific to all-stop, so I'll create another bug if it doesn't happen in non-stop.

Caused by: java.lang.NullPointerException
	at org.eclipse.cdt.dsf.gdb.multicorevisualizer.internal.ui.view.MulticoreVisualizer.workbenchToVisualizerSelection(MulticoreVisualizer.java:849)
	at org.eclipse.cdt.dsf.gdb.multicorevisualizer.internal.ui.view.MulticoreVisualizer.updateCanvasSelectionInternal(MulticoreVisualizer.java:988)
	at org.eclipse.cdt.dsf.gdb.multicorevisualizer.internal.ui.view.MulticoreVisualizer.updateCanvasSelectionInternal(MulticoreVisualizer.java:981)
	at org.eclipse.cdt.dsf.gdb.multicorevisualizer.internal.ui.view.MulticoreVisualizer$4.run(MulticoreVisualizer.java:973)
	at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
	at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:135)
	... 24 more

Comment 19 Marc Khouzam

2013-07-11 11:14:59 EDT

What do you think about renaming this bug: "Allow all-stop mode for multicore visualizer"?

I think it will help if we ever look for this change in the future.

Comment 20 William Swanson

2013-07-11 11:49:36 EDT

Agreed -- in general it's a good idea once a bug has been evaluated
to have the summary reflect the assessment of what actually needs fixing.

And if the original content of the summary line, i.e. the error message,
is in the comments as it is here, it's still searchable.

Comment 21 Marc Dumais

2013-07-16 15:19:08 EDT

Hi,

Thanks Marc-Andre and Marc for the review.  I will shortly push a revised patch.

One case that was not covered in the review is that the canvas filtering involving threads had an issue in all-stop mode:

When threads are created while the program is running, we can't ask GDB for detailed information about the new threads.  So we instead add them to the model with some placeholder values (os tid=0, core=0, state=RUNNING).   

The canvas filter relies on the "OS thread id" to identify filtered threads (using VisualizerThread.compareTo() and VisualizerThread.getID()).  So if the user filters on thread(s) displayed with placeholder info, the filter would fail to work once the visualizer has been updated with the correct info; for instance once the execution is stopped on a breakpoint.  At that point, placeholder threads selected as part of the filter would no longer be shown, since they were not recognized by the filter.

I have modified VisualizerThread so that compareTo() and getID(), which are used by the filter, now rely on the "gdb thread id" instead of the "os thread id".  The reason this works is that the "gdb thread id" is available from the context when a thread is created, and so is always correct from the start, even in all-stop mode.

This change will be included in the updated patch.
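
The effect of keying the filter on the GDB thread id can be shown with a small sketch. FilterThreadSketch and ThreadFilterSketch are hypothetical names and are not the real VisualizerThread or canvas filter; only the idea that getID() and compareTo() use the GDB thread id is taken from the description above.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical thread whose identity is the GDB thread id, not the OS tid. */
class FilterThreadSketch implements Comparable<FilterThreadSketch> {
    final int gdbThreadId;
    final int osThreadId; // may be the placeholder value 0 in all-stop mode

    FilterThreadSketch(int gdbThreadId, int osThreadId) {
        this.gdbThreadId = gdbThreadId;
        this.osThreadId = osThreadId;
    }

    /** Identity used by the filter: known as soon as the thread is created. */
    int getID() {
        return gdbThreadId;
    }

    @Override
    public int compareTo(FilterThreadSketch other) {
        return Integer.compare(gdbThreadId, other.gdbThreadId);
    }
}

/** Hypothetical filter that remembers the ids of the selected threads. */
class ThreadFilterSketch {
    private final Set<Integer> ids = new TreeSet<>();

    void add(FilterThreadSketch thread) {
        ids.add(thread.getID());
    }

    boolean accepts(FilterThreadSketch thread) {
        return ids.contains(thread.getID());
    }
}
```

A thread added to the filter while it still carries the placeholder OS thread id keeps matching after the model is rebuilt with the real value, because the id the filter stored never depended on it.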

Comment 22 Marc-André Laperle

2013-08-15 10:22:41 EDT

Patch applied to master. Thanks everyone! :)

Comment 23 Marc-André Laperle

2013-08-15 12:06:10 EDT

Marking as fixed.