Bug 572880 - Disassembly should show opcodes as byte sequence
Summary: Disassembly should show opcodes as byte sequence
Status: RESOLVED FIXED
Alias: None
Product: CDT
Classification: Tools
Component: cdt-debug-hardware (show other bugs)
Version: Next   Edit
Hardware: PC Windows 10
: P3 enhancement with 7 votes (vote)
Target Milestone: 10.3.0   Edit
Assignee: Marc Ernst CLA
QA Contact: John Dallaway CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-15 08:37 EDT by Marc Ernst CLA
Modified: 2021-05-23 00:27 EDT (History)
7 users (show)

See Also:


Attachments
ellipsis (16.60 KB, image/png)
2021-05-18 13:50 EDT, Jonah Graham CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Marc Ernst CLA 2021-04-15 08:37:56 EDT
Opcodes should be shown as a byte sequence instead of hex. It might be good to have also an option to switch between both display modes. GDB supports the opcode as byte sequence with disass /r. In addition users should be able to modify the single bytes.

If there is an agreement if and how this can be implemented I can help with creating a patch.
Comment 1 Jonah Graham CLA 2021-04-15 09:34:39 EDT
Hi Marc,

On first pass that sounds completely reasonable. This bug seems to cover two different things:
1- how opcodes (bytes) are displayed/rendered
2- add ability to edit opcodes (bytes)

I think there is a definite yes to it being implemented. The how is a bit more of a conversation that I will start to kick off here with some questions/pointers/comments:

- CDT currently received the raw opcodes from GDB[1] and converts them into a BigInteger, so it seems reasonable to keep them in their original form and pass that up to the UI.

- Other than the alignment issue (I personally file disass /r hard to read because insns are not aligned) I don't see much value in providing the option to display the BigInteger version of the opcodes in the UI. The opcodes are displayed using rulers, so the alignment probably comes for free.

- The DisassemblyView is (AFAIK) a read-only UI, so adding editing functionality may be challenging to provide inline. I don't know if there is a way to have a ruler have an editable component. Having a pop-up should be possible though.

- The API around DisassemblyView is complicated because of the async nature and because it used to support two different backends (DSF and now deleted CDI). This means a lot of stuff is listed as API that probably shouldn't be. So there will be some discussions on how to mitigate that.

- MIDataDisassemble (the wrapper for MI's -data-disassemble) has not been updated in a while. A few releases back GDB added new modes (4 and 5). I don't know if you want one of the improved modes to achieve this fully.

I hope that is enough for you to get started on an implementation/patch. To make it easier to review I think that the display and editing aspects should be split into (at least) two commits, they can be dependent on each other if needed. Please let me know if you have any questions.

Thanks!
Jonah


[1] https://git.eclipse.org/r/plugins/gitiles/cdt/org.eclipse.cdt/+/refs/heads/master/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/output/MIInstruction.java#158
Comment 2 Marc Ernst CLA 2021-04-16 07:03:55 EDT
Hi Jonah,
thanks a lot for your fast response and the really helpful information. I can work now on a first patch for showing the bytes in the disassembly view.

Kind regards,
Marc
Comment 3 Eclipse Genie CLA 2021-04-21 01:14:51 EDT
New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/179579
Comment 4 Eclipse Genie CLA 2021-04-21 01:17:06 EDT
New Gerrit change created: https://git.eclipse.org/r/c/tcf/org.eclipse.tcf/+/179580
Comment 7 Jonah Graham CLA 2021-04-23 16:15:17 EDT
This is all merged now. Just needs a N&N entry with a screenshot showing the nice new UI. https://wiki.eclipse.org/CDT/User/NewIn103

@Marc, are you able to write the N&N? Please create a new bug for the follow on work of editing opcodes.
Comment 8 Marc Ernst CLA 2021-04-26 02:04:03 EDT
Hello Jonah,
I've created a new bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=573157 for the editing functionality. In addition I've created a short N&N which is in review currently.

Thanks,
Marc
Comment 9 Jonah Graham CLA 2021-04-26 09:21:19 EDT
Thanks Marc - I like the new and improved look of that ruler column!

Someone else got to the moderation queue before me, hopefully they have marked your changes to be auto-approved in the future, if not please ping me and I will do that.
Comment 10 John Dallaway CLA 2021-05-11 14:18:05 EDT
This change is problematic for word-addressable target architectures. For example, with a 16-bit word-addressable architecture GDB returns an opcode string such as "1234 5678" (4 octets, 2 addressable units) rather than "12 34 56 78".

The method org.eclipse.cdt.debug.internal.ui.disassembly.dsf.DisassemblyUtils.decodeOpcode(String) assumes a byte-addressable architecture.

Could we eliminate conversion of the opcode string to a Byte[] and then back to a String for presentation? In most CDT debugging views, DSF-GDB relies on GDB itself for the formatting of data.

I am concerned that we don't bake-in this assumption at API level.
Comment 11 Jonah Graham CLA 2021-05-11 15:42:01 EDT
(In reply to John Dallaway from comment #10)
> This change is problematic for word-addressable target architectures. For
> example, with a 16-bit word-addressable architecture GDB returns an opcode
> string such as "1234 5678" (4 octets, 2 addressable units) rather than "12
> 34 56 78".
> 
> The method
> org.eclipse.cdt.debug.internal.ui.disassembly.dsf.DisassemblyUtils.
> decodeOpcode(String) assumes a byte-addressable architecture.
> 
> Could we eliminate conversion of the opcode string to a Byte[] and then back
> to a String for presentation? In most CDT debugging views, DSF-GDB relies on
> GDB itself for the formatting of data.

That is a probably the right solution. 

TCF has the data as an array of bytes, so TCF can then format the string before passing it to DSF-GDB.

> 
> I am concerned that we don't bake-in this assumption at API level.

FWIW there is no byte assumption in the API, only in the implementation. 

However there is some "pseudo-API" because the internal interface IDisassemblyDocument was intended to only share CDI and DSF originally, it is now used by TCF too. Because it is not API and TCF want to support wide range of CDT versions the TCF implementation uses reflection to access the new methods!
Comment 12 John Dallaway CLA 2021-05-12 13:05:55 EDT
(In reply to Jonah Graham from comment #11)
> (In reply to John Dallaway from comment #10)
> > This change is problematic for word-addressable target architectures. For
> > example, with a 16-bit word-addressable architecture GDB returns an opcode
> > string such as "1234 5678" (4 octets, 2 addressable units) rather than "12
> > 34 56 78".
> > 
> > The method
> > org.eclipse.cdt.debug.internal.ui.disassembly.dsf.DisassemblyUtils.
> > decodeOpcode(String) assumes a byte-addressable architecture.
> > 
> > Could we eliminate conversion of the opcode string to a Byte[] and then back
> > to a String for presentation? In most CDT debugging views, DSF-GDB relies on
> > GDB itself for the formatting of data.
> 
> That is a probably the right solution.

@Marc Ernst, are you able to look at this issue in the short term? Unfortunately, the current code constitutes a regression in behaviour for some users. I can test any changes if necessary.
Comment 13 Marc Ernst CLA 2021-05-18 04:40:23 EDT
Hi John, yes, I can take care of this but can start earliest end of next week.
Comment 14 Eclipse Genie CLA 2021-05-18 13:44:02 EDT
New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180744
Comment 15 Eclipse Genie CLA 2021-05-18 13:44:05 EDT
New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180743
Comment 16 Jonah Graham CLA 2021-05-18 13:50:13 EDT
Created attachment 286416 [details]
ellipsis

(In reply to Marc Ernst from comment #13)
> Hi John, yes, I can take care of this but can start earliest end of next
> week.

I think end of next week is too long to wait as we need this merged by 31 May at the latest, so I have submitted a fix for this:
> https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180743

In addition, I found that the silently cutting off of extra bytes to not be a good solution, so if the opcode length is too much, the last visible character is replaced with an ellipsis. 
> https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180744

I suspect this is rare to never occurrence, to test it I temporarily changed org.eclipse.cdt.dsf.debug.internal.ui.disassembly.OpcodeRulerColumn.MAXWIDTH to a smaller value so I could obtain the attached screenshot.
Comment 19 Jonah Graham CLA 2021-05-23 00:27:08 EDT
Regression identified in Comment 10 fixed, and minor adjustment that shows ellipsis committed.