<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Custom Visual Studio language services: ManagedMyC meets ANTLR</title>
	<atom:link href="http://blog.280z28.org/archives/2008/10/21/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.280z28.org/archives/2008/10/21/</link>
	<description>Because it&#039;s easier than editing the HTML by hand.</description>
	<lastBuildDate>Wed, 05 Oct 2011 19:39:44 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Manuel</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-22941</link>
		<dc:creator>Manuel</dc:creator>
		<pubDate>Fri, 18 Mar 2011 13:05:40 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-22941</guid>
		<description>&lt;p&gt;Hi Sam. Can you explain how I can create the installer for this language service?&lt;/p&gt;

&lt;p&gt;Thanks!!&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Hi Sam. Can you explain how I can create the installer for this language service?</p>

<p>Thanks!!</p>]]></content:encoded>
	</item>
	<item>
		<title>By: pumR</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-7837</link>
		<dc:creator>pumR</dc:creator>
		<pubDate>Thu, 14 May 2009 14:55:46 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-7837</guid>
		<description>&lt;p&gt;Hi Guys!
The real problem for me is getting the language service running in my custom editor rather than in the core editor.
I hope somebody can help me. How can I attach a specific language service to, e.g., a text field?
Any links or snippets would help a lot!
Thx!&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Hi Guys!
The real problem for me is getting the language service running in my custom editor rather than in the core editor.
I hope somebody can help me. How can I attach a specific language service to, e.g., a text field?
Any links or snippets would help a lot!
Thx!</p>]]></content:encoded>
	</item>
	<item>
		<title>By: James N</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-6000</link>
		<dc:creator>James N</dc:creator>
		<pubDate>Wed, 28 Jan 2009 17:11:43 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-6000</guid>
		<description>&lt;p&gt;Almost.  The asterisk keeps getting interpreted as a special character by WordPress.  That was supposed to be &quot;When you encounter /[asterisk]&quot; and &quot;when you later encounter [asterisk]/&quot;.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Almost.  The asterisk keeps getting interpreted as a special character by WordPress.  That was supposed to be &#8220;When you encounter /[asterisk]&#8221; and &#8220;when you later encounter [asterisk]/&#8221;.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: James N</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-5999</link>
		<dc:creator>James N</dc:creator>
		<pubDate>Wed, 28 Jan 2009 17:06:22 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-5999</guid>
		<description>&lt;p&gt;Urgh, that got mangled.  Let me try again.&lt;/p&gt;

&lt;p&gt;I use MPLex&#039;s start conditions just like the ManagedMyC example does, e.g. you provide a different set of patterns that only apply when you are inside a comment, and prefix them with &lt;COMMENT&gt;.  When you encounter /&lt;strong&gt;, you put &quot;BEGIN&lt;COMMENT&gt;&quot; in your rule to enter the COMMENT start condition; when you later encounter &quot;&lt;/strong&gt;/&quot;, you use &quot;BEGIN&lt;INITIAL&gt;&quot; to end it.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Urgh, that got mangled.  Let me try again.</p>

<p>I use MPLex&#8217;s start conditions just like the ManagedMyC example does, e.g. you provide a different set of patterns that only apply when you are inside a comment, and prefix them with &lt;COMMENT&gt;.  When you encounter /<strong>, you put &#8220;BEGIN&lt;COMMENT&gt;&#8221; in your rule to enter the COMMENT start condition; when you later encounter &#8220;</strong>/&#8221;, you use &#8220;BEGIN&lt;INITIAL&gt;&#8221; to end it.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: James N</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-5998</link>
		<dc:creator>James N</dc:creator>
		<pubDate>Wed, 28 Jan 2009 17:00:21 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-5998</guid>
		<description>&lt;blockquote&gt;
  &lt;p&gt;...the ManagedMyC sample from the Visual Studio SDK... The most important thing to note at this point: many parts of this sample are inefficient, clumsy, and/or just done the wrong way.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hmm... I just made a project implementing a language service for a simple C-like language, using the ManagedMyC sample as a starting point.  Could you elaborate on what this sample does incorrectly? (I&#039;ve found some bugs in it already, though.)&lt;/p&gt;

&lt;p&gt;About the manual coding vs. use of predicates issue with block comments: I use MPLex&#039;s start conditions just like the ManagedMyC example does, e.g. you provide a different set of patterns that only apply when you are inside a comment, and prefix them with .  When you encounter &quot;/&lt;em&gt;&quot;, you put &quot;BEING&quot; in your rule to enter the COMMENT start condition; when you later encounter &quot;&lt;/em&gt;/&quot;, you use &quot;BEING&quot; to end it.  I don&#039;t know if ANTLR has anything like start conditions, but they were made to handle cases like this.  I also use them to handle string literals and preprocessor statements, and they work like a charm.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<blockquote>
  <p>&#8230;the ManagedMyC sample from the Visual Studio SDK&#8230; The most important thing to note at this point: many parts of this sample are inefficient, clumsy, and/or just done the wrong way.</p>
</blockquote>

<p>Hmm&#8230; I just made a project implementing a language service for a simple C-like language, using the ManagedMyC sample as a starting point.  Could you elaborate on what this sample does incorrectly? (I&#8217;ve found some bugs in it already, though.)</p>

<p>About the manual coding vs. use of predicates issue with block comments: I use MPLex&#8217;s start conditions just like the ManagedMyC example does, e.g. you provide a different set of patterns that only apply when you are inside a comment, and prefix them with .  When you encounter &#8220;/<em>&#8220;, you put &#8220;BEING&#8221; in your rule to enter the COMMENT start condition; when you later encounter &#8220;</em>/&#8221;, you use &#8220;BEING&#8221; to end it.  I don&#8217;t know if ANTLR has anything like start conditions, but they were made to handle cases like this.  I also use them to handle string literals and preprocessor statements, and they work like a charm.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: 280Z28</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-5896</link>
		<dc:creator>280Z28</dc:creator>
		<pubDate>Tue, 06 Jan 2009 19:21:32 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-5896</guid>
		<description>&lt;p&gt;Hi Mike,&lt;/p&gt;

&lt;p&gt;I&#039;ve done 3 different things for 3 different languages. Each one was successful (good performance) for source files of 20,000+ lines / 500+ KB.&lt;/p&gt;

&lt;p&gt;UnrealScript:&lt;/p&gt;

&lt;p&gt;I&#039;m not compiling UnrealScript, so the grammar is solely used for IntelliSense purposes. The lexer rules in this grammar support the method described in this post, and the colorizer is implemented as a larger version of what&#039;s in this post.&lt;/p&gt;

&lt;p&gt;StringTemplate:&lt;/p&gt;

&lt;p&gt;I updated the lexer rules in Group.g3 to meet the colorizer requirements described in this post. I don&#039;t like this as much because the implementation of the StringTemplate library, which is completely independent of the language service, must now meet special requirements so the language service works. This type of dependency is unacceptable, so I&#039;ll be changing over to the method I use for the ANTLR v3 Grammar language service.&lt;/p&gt;

&lt;p&gt;ANTLR v3:&lt;/p&gt;

&lt;p&gt;I reference the C# port of the ANTLR tool to gather IntelliSense information / full source parsing. To implement the colorizer, I copied all of the lexer rules from ANTLR.g3 and made a new AntlrColorizerLexer.g3 inside the language service. I then updated this lexer to support the colorizer. If the ANTLR &lt;em&gt;lexer&lt;/em&gt; spec changes in the future, I will have to update this lexer in the language service to reflect the changes, but I believe this is an acceptable situation.&lt;/p&gt;

&lt;p&gt;Finally, regarding the use of manual coding instead of predicates: predicates of this form greatly impact the performance of the lexer. The method for implementing a colorizer as described here offers good performance and provides easy access to the original token information from the lexer at any point in the code via the TokenInfo. The StartIndex and EndIndex give the location, and the Token member (int) holds the lexer token type.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Hi Mike,</p>

<p>I&#8217;ve done 3 different things for 3 different languages. Each one was successful (good performance) for source files of 20,000+ lines / 500+ KB.</p>

<p>UnrealScript:</p>

<p>I&#8217;m not compiling UnrealScript, so the grammar is solely used for IntelliSense purposes. The lexer rules in this grammar support the method described in this post, and the colorizer is implemented as a larger version of what&#8217;s in this post.</p>

<p>StringTemplate:</p>

<p>I updated the lexer rules in Group.g3 to meet the colorizer requirements described in this post. I don&#8217;t like this as much because the implementation of the StringTemplate library, which is completely independent of the language service, must now meet special requirements so the language service works. This type of dependency is unacceptable, so I&#8217;ll be changing over to the method I use for the ANTLR v3 Grammar language service.</p>

<p>ANTLR v3:</p>

<p>I reference the C# port of the ANTLR tool to gather IntelliSense information / full source parsing. To implement the colorizer, I copied all of the lexer rules from ANTLR.g3 and made a new AntlrColorizerLexer.g3 inside the language service. I then updated this lexer to support the colorizer. If the ANTLR <em>lexer</em> spec changes in the future, I will have to update this lexer in the language service to reflect the changes, but I believe this is an acceptable situation.</p>

<p>Finally, regarding the use of manual coding instead of predicates: predicates of this form greatly impact the performance of the lexer. The method for implementing a colorizer as described here offers good performance and provides easy access to the original token information from the lexer at any point in the code via the TokenInfo. The StartIndex and EndIndex give the location, and the Token member (int) holds the lexer token type.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Pagel</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-5893</link>
		<dc:creator>Mike Pagel</dc:creator>
		<pubDate>Mon, 05 Jan 2009 20:17:19 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-5893</guid>
		<description>&lt;p&gt;Hi Sam,&lt;/p&gt;

&lt;p&gt;Thanks for this post. I do have a few questions, though.&lt;/p&gt;

&lt;p&gt;It seems you have introduced &quot;multi-line tokens&quot;, and your way of handling them, more or less in order to stick with the Babel framework&#039;s approach of using only lexer tokens for colorization. But I believe this is not optimal:&lt;/p&gt;

&lt;p&gt;(1) Lexer tokens like &quot;Identifier&quot; will appear in different scenarios (parser rules) where they are, e.g., class names or object names, which Visual Studio already colors differently, so those colors cannot come from the same token.&lt;/p&gt;

&lt;p&gt;You therefore must solve this by adding a state machine to the lexer, which essentially turns the regular lexer language into a context-free language. Since this is typically not supported by lexer generators, you have to hand-code the state machine, as done in your sample by introducing the state variable InBlockComment. For comments this is acceptable, but what about the class vs. object name example? You essentially would have to build parts of the AST to understand in the lexer (!) what kind of Identifier you are currently scanning. That will be a lot of effort, won&#039;t it?&lt;/p&gt;

&lt;p&gt;(2) Then you add the switch/case statement (&quot;if&quot; in your sample) that evaluates the current lexer state to the handwritten NextToken method. That is quite hard to maintain, as the state machine is now split into two parts: state-modifying actions in the lexer grammar, and guards and transition detection in NextToken(). I am wondering whether ANTLR&#039;s semantic predicates would do better here.&lt;/p&gt;

&lt;p&gt;Is the approach shown in this post really scalable to support real languages of some size (whatever that means...)?&lt;/p&gt;

&lt;p&gt;I&#039;d be happy to hear about your thoughts. Thanks a lot,
Mike&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Hi Sam,</p>

<p>Thanks for this post. I do have a few questions, though.</p>

<p>It seems you have introduced &#8220;multi-line tokens&#8221;, and your way of handling them, more or less in order to stick with the Babel framework&#8217;s approach of using only lexer tokens for colorization. But I believe this is not optimal:</p>

<p>(1) Lexer tokens like &#8220;Identifier&#8221; will appear in different scenarios (parser rules) where they are, e.g., class names or object names, which Visual Studio already colors differently, so those colors cannot come from the same token.</p>

<p>You therefore must solve this by adding a state machine to the lexer, which essentially turns the regular lexer language into a context-free language. Since this is typically not supported by lexer generators, you have to hand-code the state machine, as done in your sample by introducing the state variable InBlockComment. For comments this is acceptable, but what about the class vs. object name example? You essentially would have to build parts of the AST to understand in the lexer (!) what kind of Identifier you are currently scanning. That will be a lot of effort, won&#8217;t it?</p>

<p>(2) Then you add the switch/case statement (&#8220;if&#8221; in your sample) that evaluates the current lexer state to the handwritten NextToken method. That is quite hard to maintain, as the state machine is now split into two parts: state-modifying actions in the lexer grammar, and guards and transition detection in NextToken(). I am wondering whether ANTLR&#8217;s semantic predicates would do better here.</p>

<p>Is the approach shown in this post really scalable to support real languages of some size (whatever that means&#8230;)?</p>

<p>I&#8217;d be happy to hear about your thoughts. Thanks a lot,
Mike</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Sam&#8217;s Blog &#187; Blog Archive &#187; ManagedMyC: Type and member dropdown bars</title>
		<link>http://blog.280z28.org/archives/2008/10/21/comment-page-1/#comment-4716</link>
		<dc:creator>Sam&#8217;s Blog &#187; Blog Archive &#187; ManagedMyC: Type and member dropdown bars</dc:creator>
		<pubDate>Sun, 19 Oct 2008 22:34:13 +0000</pubDate>
		<guid isPermaLink="false">http://blog.280z28.org/archives/2008/10/21/#comment-4716</guid>
		<description>&lt;p&gt;[...] Here&#8217;s the source code for the ManagedMyC sample at this point. Since I surely missed things, you can always diff this code versus the original source from my first post on this subject. [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] Here&#8217;s the source code for the ManagedMyC sample at this point. Since I surely missed things, you can always diff this code versus the original source from my first post on this subject. [...]</p>]]></content:encoded>
	</item>
</channel>
</rss>

