572880 – Disassembly should show opcodes as byte sequence

Bug 572880 - Disassembly should show opcodes as byte sequence

Summary: Disassembly should show opcodes as byte sequence

Status:	RESOLVED FIXED

Alias:	None

Product:	CDT
Classification:	Tools
Component:	cdt-debug-hardware (show other bugs)
Version:	Next
Hardware:	PC Windows 10

Importance:	P3 enhancement with 7 votes (vote)
Target Milestone:	10.3.0
Assignee:	Marc Ernst
QA Contact:	John Dallaway

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2021-04-15 08:37 EDT by Marc Ernst
Modified:	2021-05-23 00:27 EDT (History)
CC List:	7 users (show)

See Also:	Gerrit Change Gerrit Change Git Commit Git Commit 573157 Gerrit Change Gerrit Change Git Commit Git Commit

Attachments
ellipsis (16.60 KB, image/png) 2021-05-18 13:50 EDT, Jonah Graham	no flags	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Marc Ernst

2021-04-15 08:37:56 EDT

Opcodes should be shown as a byte sequence instead of hex. It might be good to have also an option to switch between both display modes. GDB supports the opcode as byte sequence with disass /r. In addition users should be able to modify the single bytes.

If there is an agreement if and how this can be implemented I can help with creating a patch.

Comment 1 Jonah Graham

2021-04-15 09:34:39 EDT

Hi Marc,

On first pass that sounds completely reasonable. This bug seems to cover two different things:
1- how opcodes (bytes) are displayed/rendered
2- add ability to edit opcodes (bytes)

I think there is a definite yes to it being implemented. The how is a bit more of a conversation that I will start to kick off here with some questions/pointers/comments:

- CDT currently received the raw opcodes from GDB[1] and converts them into a BigInteger, so it seems reasonable to keep them in their original form and pass that up to the UI.

- Other than the alignment issue (I personally file disass /r hard to read because insns are not aligned) I don't see much value in providing the option to display the BigInteger version of the opcodes in the UI. The opcodes are displayed using rulers, so the alignment probably comes for free.

- The DisassemblyView is (AFAIK) a read-only UI, so adding editing functionality may be challenging to provide inline. I don't know if there is a way to have a ruler have an editable component. Having a pop-up should be possible though.

- The API around DisassemblyView is complicated because of the async nature and because it used to support two different backends (DSF and now deleted CDI). This means a lot of stuff is listed as API that probably shouldn't be. So there will be some discussions on how to mitigate that.

- MIDataDisassemble (the wrapper for MI's -data-disassemble) has not been updated in a while. A few releases back GDB added new modes (4 and 5). I don't know if you want one of the improved modes to achieve this fully.

I hope that is enough for you to get started on an implementation/patch. To make it easier to review I think that the display and editing aspects should be split into (at least) two commits, they can be dependent on each other if needed. Please let me know if you have any questions.

Thanks!
Jonah

[1] https://git.eclipse.org/r/plugins/gitiles/cdt/org.eclipse.cdt/+/refs/heads/master/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/output/MIInstruction.java#158

Comment 2 Marc Ernst

2021-04-16 07:03:55 EDT

Hi Jonah,
thanks a lot for your fast response and the really helpful information. I can work now on a first patch for showing the bytes in the disassembly view.

Kind regards,
Marc

Comment 3 Eclipse Genie

2021-04-21 01:14:51 EDT

New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/179579

Comment 4 Eclipse Genie

2021-04-21 01:17:06 EDT

New Gerrit change created: https://git.eclipse.org/r/c/tcf/org.eclipse.tcf/+/179580

Comment 5 Eclipse Genie

2021-04-23 15:45:15 EDT

Gerrit change https://git.eclipse.org/r/c/tcf/org.eclipse.tcf/+/179580 was merged to [master].
Commit: http://git.eclipse.org/c/tcf/org.eclipse.tcf.git/commit/?id=8a14461365b15cff538d739e36daf54e0c558d10

Comment 6 Eclipse Genie

2021-04-23 16:12:31 EDT

Gerrit change https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/179579 was merged to [master].
Commit: http://git.eclipse.org/c/cdt/org.eclipse.cdt.git/commit/?id=766d6fec6a60ea4e18dd9569859c9fda7bece78f

Comment 7 Jonah Graham

2021-04-23 16:15:17 EDT

This is all merged now. Just needs a N&N entry with a screenshot showing the nice new UI. https://wiki.eclipse.org/CDT/User/NewIn103

@Marc, are you able to write the N&N? Please create a new bug for the follow on work of editing opcodes.

Comment 8 Marc Ernst

2021-04-26 02:04:03 EDT

Hello Jonah,
I've created a new bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=573157 for the editing functionality. In addition I've created a short N&N which is in review currently.

Thanks,
Marc

Comment 9 Jonah Graham

2021-04-26 09:21:19 EDT

Thanks Marc - I like the new and improved look of that ruler column!

Someone else got to the moderation queue before me, hopefully they have marked your changes to be auto-approved in the future, if not please ping me and I will do that.

Comment 10 John Dallaway

2021-05-11 14:18:05 EDT

This change is problematic for word-addressable target architectures. For example, with a 16-bit word-addressable architecture GDB returns an opcode string such as "1234 5678" (4 octets, 2 addressable units) rather than "12 34 56 78".

The method org.eclipse.cdt.debug.internal.ui.disassembly.dsf.DisassemblyUtils.decodeOpcode(String) assumes a byte-addressable architecture.

Could we eliminate conversion of the opcode string to a Byte[] and then back to a String for presentation? In most CDT debugging views, DSF-GDB relies on GDB itself for the formatting of data.

I am concerned that we don't bake-in this assumption at API level.

Comment 11 Jonah Graham

2021-05-11 15:42:01 EDT

(In reply to John Dallaway from comment #10)
> This change is problematic for word-addressable target architectures. For
> example, with a 16-bit word-addressable architecture GDB returns an opcode
> string such as "1234 5678" (4 octets, 2 addressable units) rather than "12
> 34 56 78".
> 
> The method
> org.eclipse.cdt.debug.internal.ui.disassembly.dsf.DisassemblyUtils.
> decodeOpcode(String) assumes a byte-addressable architecture.
> 
> Could we eliminate conversion of the opcode string to a Byte[] and then back
> to a String for presentation? In most CDT debugging views, DSF-GDB relies on
> GDB itself for the formatting of data.

That is a probably the right solution. 

TCF has the data as an array of bytes, so TCF can then format the string before passing it to DSF-GDB.

> 
> I am concerned that we don't bake-in this assumption at API level.

FWIW there is no byte assumption in the API, only in the implementation. 

However there is some "pseudo-API" because the internal interface IDisassemblyDocument was intended to only share CDI and DSF originally, it is now used by TCF too. Because it is not API and TCF want to support wide range of CDT versions the TCF implementation uses reflection to access the new methods!

Comment 12 John Dallaway

2021-05-12 13:05:55 EDT

(In reply to Jonah Graham from comment #11)
> (In reply to John Dallaway from comment #10)
> > This change is problematic for word-addressable target architectures. For
> > example, with a 16-bit word-addressable architecture GDB returns an opcode
> > string such as "1234 5678" (4 octets, 2 addressable units) rather than "12
> > 34 56 78".
> > 
> > The method
> > org.eclipse.cdt.debug.internal.ui.disassembly.dsf.DisassemblyUtils.
> > decodeOpcode(String) assumes a byte-addressable architecture.
> > 
> > Could we eliminate conversion of the opcode string to a Byte[] and then back
> > to a String for presentation? In most CDT debugging views, DSF-GDB relies on
> > GDB itself for the formatting of data.
> 
> That is a probably the right solution.

@Marc Ernst, are you able to look at this issue in the short term? Unfortunately, the current code constitutes a regression in behaviour for some users. I can test any changes if necessary.

Comment 13 Marc Ernst

2021-05-18 04:40:23 EDT

Hi John, yes, I can take care of this but can start earliest end of next week.

Comment 14 Eclipse Genie

2021-05-18 13:44:02 EDT

New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180744

Comment 15 Eclipse Genie

2021-05-18 13:44:05 EDT

New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180743

Comment 16 Jonah Graham

2021-05-18 13:50:13 EDT

Created attachment 286416 [details]
ellipsis

(In reply to Marc Ernst from comment #13)
> Hi John, yes, I can take care of this but can start earliest end of next
> week.

I think end of next week is too long to wait as we need this merged by 31 May at the latest, so I have submitted a fix for this:
> https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180743

In addition, I found that the silently cutting off of extra bytes to not be a good solution, so if the opcode length is too much, the last visible character is replaced with an ellipsis. 
> https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180744

I suspect this is rare to never occurrence, to test it I temporarily changed org.eclipse.cdt.dsf.debug.internal.ui.disassembly.OpcodeRulerColumn.MAXWIDTH to a smaller value so I could obtain the attached screenshot.

Comment 17 Eclipse Genie

2021-05-19 09:33:01 EDT

Gerrit change https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180743 was merged to [master].
Commit: http://git.eclipse.org/c/cdt/org.eclipse.cdt.git/commit/?id=7743993a08c2c3394a24f2fe2795fadeb0843d4b

Comment 18 Eclipse Genie

2021-05-23 00:26:26 EDT

Gerrit change https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180744 was merged to [master].
Commit: http://git.eclipse.org/c/cdt/org.eclipse.cdt.git/commit/?id=2443cfbeff09abee51b95703d8e88f1d1af28446

Comment 19 Jonah Graham

2021-05-23 00:27:08 EDT

Regression identified in Comment 10 fixed, and minor adjustment that shows ellipsis committed.