Django Editor in VS 2010 – Part 2 (Background Parsing)
Look here for complete source
Every time the text in the editor window changes, all your classification, colorization, tagging, etc. has to be re-evaluated. If you are lucky, you can just take the text updates and run them through your parser. If it takes just a few statements to parse the updates, you do not need to read the rest of this post. If your parser is really that simple (or smart), then all you need in your classifier, tagger and/or quick info controller is an event handler attached to the ITextBuffer's Changed event:
    buffer.Changed += new EventHandler<TextContentChangedEventArgs>(buffer_Changed);
and then run the changes through your parser right in the event handler. Keep in mind, though, that this code will be executed on almost every keystroke in your edit window.
So, if you are like the rest of us and have to worry about making the editor too sluggish to be useful, you might want to have a look at a few tricks that helped me address this problem. Actually, I have three:
- Make sure that no matter how many classifiers, taggers, or other consumers need the results of parsing, all changes are run through the parser only once.
- Make parsing asynchronous: do not make the editor wait for the parser to finish.
- Queue the parsing: delay parsing for a second or so. This way, if the update notifications are coming really fast, you will still run the parser no more often than once a second.
At this time I do not have an implementation for the last one, but the overall structure of what I have lends itself well to queuing parsing requests.
To cover the first part I created my own interface, INodeProviderBroker. I also wrote a class implementing the interface and exported it as a MEF component:
    using System.ComponentModel.Composition;
    using Microsoft.VisualStudio.Text;

    internal interface INodeProviderBroker
    {
        NodeProvider GetNodeProvider(ITextBuffer buffer);
        bool IsNDjango(ITextBuffer buffer);
    }

    [Export(typeof(INodeProviderBroker))]
    internal class NodeProviderBroker : INodeProviderBroker
    {
        // the real parser
        IParser parser = new Parser();

        public bool IsNDjango(ITextBuffer buffer)
        {
            switch (buffer.ContentType.TypeName)
            {
                case "text":
                case "HTML":
                    return true;
                default:
                    return false;
            }
        }

        public NodeProvider GetNodeProvider(ITextBuffer buffer)
        {
            NodeProvider provider;
            // create the provider on the first request and cache it in the buffer's property bag
            if (!buffer.Properties.TryGetProperty(typeof(NodeProvider), out provider))
                buffer.Properties.AddProperty(typeof(NodeProvider), provider = new NodeProvider(parser, buffer));
            return provider;
        }
    }
Each tagger, classifier, or other consumer can import the provider broker and use the GetNodeProvider method to get a node provider for a given buffer. As you can see, GetNodeProvider creates a new NodeProvider only once, upon the first request for a given buffer. From then on, GetNodeProvider returns the existing node provider taken from the buffer's property bag. Using MEF in this particular case might have been overkill, but I decided to follow the lead of how this sort of thing is done in VsEditor.
As to the second part, let me step you through the code for the node provider:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using Microsoft.VisualStudio.Text;

    class NodeProvider
    {
        // result of the last completed parse; guarded by node_lock
        private List<NodeSnapshot> nodes = new List<NodeSnapshot>();
        private object node_lock = new object();
        private IParser parser;
        private ITextBuffer buffer;

        public NodeProvider(IParser parser, ITextBuffer buffer)
        {
            this.parser = parser;
            this.buffer = buffer;
            rebuildNodes(buffer.CurrentSnapshot);
            buffer.Changed += new EventHandler<TextContentChangedEventArgs>(buffer_Changed);
        }

        public delegate void SnapshotEvent(SnapshotSpan snapshotSpan);

        void buffer_Changed(object sender, TextContentChangedEventArgs e)
        {
            rebuildNodes(e.After);
        }

        private void rebuildNodes(ITextSnapshot snapshot)
        {
            // hand the parsing over to a thread pool thread
            ThreadPool.QueueUserWorkItem(rebuildNodesAsynch, snapshot);
        }

        public event SnapshotEvent NodesChanged;

        private void rebuildNodesAsynch(object snapshotObject)
        {
            ITextSnapshot snapshot = (ITextSnapshot)snapshotObject;
            List<NodeSnapshot> nodes = parser.Parse(snapshot.Lines.ToList().ConvertAll(line => line.GetTextIncludingLineBreak()))
                .ToList()
                .ConvertAll<NodeSnapshot>(node => new NodeSnapshot(snapshot, node));
            lock (node_lock)
            {
                this.nodes = nodes;
            }
            // copy the delegate so that a concurrent unsubscribe cannot null it between the check and the call
            SnapshotEvent handler = NodesChanged;
            if (handler != null)
                handler(new SnapshotSpan(snapshot, 0, snapshot.Length));
        }

        internal List<NodeSnapshot> GetNodes(SnapshotSpan snapshotSpan)
        {
            List<NodeSnapshot> nodes;
            lock (node_lock)
            {
                nodes = this.nodes;
            }
            if (nodes.Count == 0)
                return nodes;
            // just in case another modification was made
            // while the node list was being rebuilt;
            // work on the local copy, not on this.nodes, which may have been replaced by now
            if (nodes[0].SnapshotSpan.Snapshot != snapshotSpan.Snapshot)
                nodes.ForEach(node => node.TranslateTo(snapshotSpan.Snapshot));
            return nodes;
        }

        internal List<INode> GetNodes(SnapshotPoint point)
        {
            SnapshotSpan span = new SnapshotSpan(point.Snapshot, point.Position, 0);
            // FindAll returns an empty list when nothing matches, so no null check is needed
            return GetNodes(span)
                .FindAll(node => node.SnapshotSpan.IntersectsWith(span))
                .ConvertAll(node => node.Node);
        }
    }
In the constructor, the node provider submits a request to rebuild nodes with a call to the rebuildNodes method and subscribes to the buffer's Changed event. The event handler, buffer_Changed, calls the same method. The rebuildNodes method uses the thread pool to queue an asynchronous call to rebuildNodesAsynch. rebuildNodesAsynch does the heavy lifting of the text parsing, but runs it on a separate thread. When parsing is completed, it fires the NodesChanged event to inform the world that there is a fresh parsing result reflecting the latest updates.
The two GetNodes overloads can be called at any moment. They return the list of syntax nodes as of the last parsing completed by the moment the call was made, regardless of whether another parsing is still in progress.
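One thing to watch with this arrangement: thread pool work items are not guaranteed to finish in the order they were queued, so a slow parse of an older snapshot could in principle complete after a faster parse of a newer one and overwrite it. The following is a minimal sketch of one way to guard against that, not something the node provider above does. LatestResult and TryPublish are hypothetical names; the version number is assumed to come from something like the snapshot's Version.VersionNumber.

```csharp
using System;

// Hypothetical holder, not part of the original NodeProvider.
// Keeps only the result with the highest version seen so far.
public sealed class LatestResult<T>
{
    private readonly object gate = new object();
    private int version = -1;
    private T value;

    // Returns true if the candidate was stored, false if an equal or
    // newer result was already in place and the candidate was dropped.
    public bool TryPublish(int candidateVersion, T candidate)
    {
        lock (gate)
        {
            if (candidateVersion <= version)
                return false;
            version = candidateVersion;
            value = candidate;
            return true;
        }
    }

    // Returns the most recently published result (default(T) if none yet).
    public T Read()
    {
        lock (gate)
        {
            return value;
        }
    }
}
```

With something like this, rebuildNodesAsynch would fire NodesChanged only when TryPublish returns true, so consumers are never notified about a stale parse.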
BTW, this interaction between rebuildNodes and rebuildNodesAsynch is a good point to insert an implementation for trick #3, the queuing of parsing requests. The rest of the code does not have to be changed.
If you go back now to Part 1 you will see how the node provider is used to retrieve parsing results. In the next post I will show how it is used with some other consumers.
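To make trick #3 a bit more concrete, here is a rough sketch of what such a queue could look like. It is not the implementation used in the editor (there is none yet); ParseRequestQueue, Request and the delay parameter are names made up for this illustration. The first request starts a one-shot timer, later requests arriving before it fires only replace the pending argument, and when the timer fires the action runs once with the latest argument.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the original NodeProvider.
// Runs the action at most once per delay window, always with the
// most recent argument passed to Request.
public sealed class ParseRequestQueue<T> : IDisposable
{
    private readonly object gate = new object();
    private readonly int delayMilliseconds;
    private readonly Action<T> action;
    private readonly Timer timer;
    private T pending;
    private bool armed;

    public ParseRequestQueue(int delayMilliseconds, Action<T> action)
    {
        this.delayMilliseconds = delayMilliseconds;
        this.action = action;
        // Created disarmed; the first Request arms it.
        timer = new Timer(OnTimer, null, Timeout.Infinite, Timeout.Infinite);
    }

    public void Request(T argument)
    {
        lock (gate)
        {
            // Always remember the latest argument.
            pending = argument;
            if (!armed)
            {
                armed = true;
                // One-shot: fire once after the delay, no periodic repeat.
                timer.Change(delayMilliseconds, Timeout.Infinite);
            }
        }
    }

    // System.Threading.Timer invokes this on a thread pool thread.
    private void OnTimer(object state)
    {
        T argument;
        lock (gate)
        {
            argument = pending;
            pending = default(T);
            armed = false;
        }
        action(argument);
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```

In the node provider, rebuildNodes would then call Request(snapshot) on a queue constructed with a delay of about 1000 ms and an action that does what rebuildNodesAsynch does now. Because the timer callback already runs on a thread pool thread, the parsing still stays off the editor's thread.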