483386 – Semantic highlighting features not implemented in the CEditor for files opened from the history

Bug 483386 - Semantic highlighting features not implemented in the CEditor for files opened from the history

Summary: Semantic highlighting features not implemented in the CEditor for files opene...

Status:	NEW

Alias:	None

Product:	CDT
Classification:	Tools
Component:	cdt-editor (show other bugs)
Version:	Next
Hardware:	PC Windows 7

Importance:	P3 enhancement (vote)
Target Milestone:	---
Assignee:	Project Inbox
QA Contact:	Jonah Graham

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2015-12-01 10:20 EST by Florian T.
Modified:	2020-09-04 15:22 EDT (History)
CC List:	4 users (show)

See Also:	176036 323673

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Florian T.

2015-12-01 10:20:46 EST

Steps To Reproduce: 
1. open a c or c++ file from the History:
   a. select a file
   b. right click on it
   c. Team->Show in history
   d. double click on a version)
2. Check semantic highlighting
   a. Double click on a variable: no occurrences are marked.
   b. Disabled pre-processor blocks are not grayed out
   c. Try to hit F3 on a variable/function; no jump to its definition
   d. SDK calls (like printf) are not colored; only C keywords, comments and Strings are colored

Might be related to Bug 176036 ("The CEditor does not work well with files opened from outside of the workspace").
Note that the java editor handles the semantic highlighting.

Comment 1 Jonah Graham

2015-12-01 15:27:28 EST

(In reply to Florian T. from comment #0)
> Might be related to Bug 176036 ("The CEditor does not work well with files
> opened from outside of the workspace").
> Note that the java editor handles the semantic highlighting.

It does seem quite related, of course at some point someone fixed the issue in Bug 176036. Perhaps someone in the know can identify what the missing piece of that puzzle was so that the same thing can be applied to storage sourced editors.

Comment 2 Nathan Ridge

2015-12-03 23:42:16 EST

To provide editor features like semantic highlighting, CDT needs to be able to build an AST corresponding to an editor's contents.

To build an AST, CDT needs create a model object of type ITranslationUnit based on the the editor's input. This happens in CDocumentProvider.createFileInfo().

The editor's input is represented as an IEditorInput. CDocumentProvider.createFileInfo() handles several kinds of editor inputs, but not the kind used when opening a revision of a file, which appears to be org.eclipse.team.internal.ui.history.FileRevisionEditorInput.

We can't even access this type in CDT code, but it implements an interface IStorageEditorInput, which exposes an IStorage object, which exposes an InputStream representing the file revision contents, and an IPath representing the path of the file it is a revision of.

It should be possible to create a TranslationUnit based on these two things (an InputStream and an IPath); some changes to the TranslationUnit class would be required. The IPath can be used to synthesize a "context" similar to how EditorUtility.getEditorInputForLocation() does it, and the InputStream can be usef in lieu of a file in TranslationUnit.openBuffer().

I don't have any plans to work on this, but I'm happy to provide guidance to someone who wants to give it a try.

Comment 3 Nathan Ridge

2017-12-27 04:23:49 EST

(In reply to Nathan Ridge from comment #2)
> To provide editor features like semantic highlighting, CDT needs to be able
> to build an AST corresponding to an editor's contents.
> 
> To build an AST, CDT needs create a model object of type ITranslationUnit
> based on the the editor's input.

There is an additional complication here which didn't occur to me when I wrote my previous comment.

For a file that includes other files via #include directives (which is practically all interest files in real-world code), building an accurate AST also requires an index which contains a representation of the contents of the included files, since those are not parsed directly.

For a file opened from history, this presents a problem. I could envision doing one of three things:

  (1) Build an AST that's not backed by an index.

      There are two sub-options here:

        (a) Do not parse included files when building this 
            AST. This is what we currently do if you e.g.
            disable the indexer for a project.

            This means all symbols from an included file, or 
            that depend on symbols from an included file,
            will be unresolved, and highlighted incorrectly
            as a result.

        (b) Parse included files when building this AST.

            Symbols would be resolved and highlighted
            correctly. However, building this kind of AST
            is expensive, and may delay the point where
            semantic coloring kicks in long enough that the
            feature is not useful, particularly for large
            projects.

  (2) Build an AST backed by the project's current index.

      This basically means interpreting the *historical*
      contents of the file being viewed with an index that
      reflects the *current* contents of the files it 
      includes.

      This may work well enough in practice, especially
      when the historical revision being viewed is not too
      old / too different from the current revision.
      However, the results it gives can get arbitrarily bad
      as you look farther back in time / across significant
      changes.

  (3) Build an index that reflects the contents of the 
      project at the time of the historical revision, and
      build an AST backed by that index.

      This would result in correct symbol resolution and
      highlighting, but the performance characteristics
      would be similar to (1b) or even worse, unless we
      kept around indexes for old revisions (which would
      quickly become prohibitive in terms of disk space,
      especially for large projects).

      This would also involve a great deal of 
      implementation complexity.

None of these are great solutions; (2) is probably the best compromise between accuracy, performance, and implementation complexity, but it is still very much a compromise.

> I don't have any plans to work on this, but I'm happy to provide guidance to
> someone who wants to give it a try.

(This still applies.)