The .NET Compiler API in Practice: Diagnostics, Refactoring, and Scripting

This document functions as a technical guide focusing on the .NET Compiler API, also known as Roslyn. It primarily explains how developers can leverage this open-source API to perform code analysis by writing custom diagnostics and refactorings for C# applications. The text further explores the Scripting API, a component of Roslyn, demonstrating how it enables C# to function as a dynamic scripting language. Throughout the sources, there’s an emphasis on practical implementation, including details on testing, debugging, and deployment of these compiler-driven tools, along with a discussion of future possibilities for C# powered by this API.

Mastering the .NET Compiler API: Roslyn Revealed

The Compiler API, also known by its code name Project Roslyn, is a new infrastructure from Microsoft that opens up the internal workings of the .NET compilation pipeline via a public .NET API. This marks a significant departure from the traditional model where the .NET compiler was a monolithic executable with no public APIs exposed, often referred to as a “closed box”.

Purpose and Evolution: Historically, compilers were seen as a “closed box,” where developers provided file paths and optional switches, and the compiler produced an executable. There was no way to “plug into” the compiler’s pipeline, augment the process, or use its functionality outside of compilation. This led to issues like inconsistency among code analysis tools that had to duplicate compilation logic and a lack of direct community involvement in shaping the language.

The Compiler API addresses these issues by:

  • Providing public access to compiler functionality within any .NET application.
  • Enabling tools for code analysis and allowing developers to perform code generation and dynamic compilation in their applications.
  • Promoting an open standard that everyone can use.
  • Making the source code freely available for anyone to read and contribute to, fostering a strong community around the .NET compilation system.

Core Components and Concepts:

  1. Syntax Trees:
  • The fundamental API data structure used by the Compiler API.
  • They represent the textual content of code and how that content relates to C#.
  • Even for small pieces of C# code, these trees can become quite large.
  • The process of compiling code involves creating a tree from text.
  • You can build your own trees from scratch using the SyntaxFactory class along with the SyntaxTree class.
  • Trees are immutable; when you “modify” a tree, you actually get a new node or tree back, with the original remaining unchanged. This design aids in easy comparison between nodes and efficient memory management.
  2. Syntax Nodes, Tokens, and Trivia:
  • Within a Compiler API tree, there are three essential base types:
  • SyntaxNode: An abstract class that can contain other tree types, directly or indirectly (e.g., ClassDeclarationSyntax, MethodDeclarationSyntax).
  • SyntaxToken: A struct that represents a terminal element in the tree, such as keywords, identifiers, and braces. Its Kind property uses a SyntaxKind enumeration to specify the type of token.
  • SyntaxTrivia: Also structs, these represent the “unimportant” parts of code such as spaces, tabs, and end-of-line characters. While they don’t affect execution, they are crucial for preserving code formatting and developer style.
  3. Semantic Models:
  • While syntax trees understand the textual structure, a semantic model provides deeper meaning to tokens.
  • It offers a layer on top of the syntax tree to provide information that is not easily inferred from syntax alone, such as type names, whether a class is sealed, or if an argument is passed by reference.
  • Obtaining a semantic model requires a compilation object and involves extra work, which may incur a small performance cost.
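
These pieces can be seen working together in a short sketch (the "Demo" assembly name and sample class are invented for illustration): parse text into a tree, walk its nodes, then build a compilation to obtain a semantic model.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class Program
{
    static void Main()
    {
        // Text -> syntax tree.
        var tree = CSharpSyntaxTree.ParseText(
            "public sealed class A { public int Value { get; set; } }");

        // Syntax alone: find the class declaration node.
        var classNode = tree.GetRoot().DescendantNodes()
            .OfType<ClassDeclarationSyntax>().First();

        // Semantics require a compilation, which is where the extra cost comes in.
        var compilation = CSharpCompilation.Create("Demo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);
        var symbol = model.GetDeclaredSymbol(classNode);

        Console.WriteLine(symbol.IsSealed);   // a fact syntax alone cannot tell you
    }
}
```

The split is the important part: the tree answers "what text is here?", while the semantic model answers "what does this text mean?".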

Key Functionality and Capabilities:

  • Compiling Code: The API allows for compiling C# code on the fly into .NET assemblies (Intermediate Language with metadata). This involves parsing code to create a syntax tree, then compiling that tree using a CSharpCompilation object.
  • Creating Code Using Trees: Developers can build tree structures directly, or use tools like RoslynQuoter to generate the necessary CompilationUnitSyntax objects based on C# code snippets.
  • Navigating and Editing Trees:
  • Navigation: You can find content within a tree using “Descendant” methods (e.g., DescendantNodes()) on a node to find specific information. Alternatively, “walker” classes like CSharpSyntaxWalker can be used to visit every node within a tree.
  • Editing: Although trees are immutable, you can create “modified” trees using “replace” methods (e.g., ReplaceNodes(), ReplaceTokens(), ReplaceTrivia()) or by using “rewriters” that inherit from CSharpSyntaxRewriter.
  • Annotations and Formatters:
  • Annotations (SyntaxAnnotation): Allow you to mark nodes and tokens with custom information for later retrieval without affecting the compiled output or printed code.
  • Formatters: The API provides ways to apply “common” C# formatting (e.g., NormalizeWhitespace()) or use workspaces to define how code should be formatted, which is particularly valuable for code fixes.
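
A minimal sketch of compiling code on the fly (the "Calc" class and Add method are hypothetical examples): parse a string, compile it with a CSharpCompilation, emit IL to memory, and invoke the result.

```csharp
using System;
using System.IO;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

class Program
{
    static void Main()
    {
        var tree = CSharpSyntaxTree.ParseText(
            "public static class Calc { public static int Add(int a, int b) { return a + b; } }");

        var compilation = CSharpCompilation.Create("Calc",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var stream = new MemoryStream())
        {
            var result = compilation.Emit(stream);   // IL plus metadata
            if (result.Success)
            {
                var assembly = Assembly.Load(stream.ToArray());
                var add = assembly.GetType("Calc").GetMethod("Add");
                Console.WriteLine(add.Invoke(null, new object[] { 2, 3 }));   // 5
            }
        }
    }
}
```

When Emit() fails, result.Diagnostics explains why, which is the same diagnostic machinery the analyzers described later build on.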

Tools for Development:

  • Syntax Visualizer: A crucial tool included with the .NET Compiler SDK that allows developers to visualize the full syntax tree for any given C# code in Visual Studio. It helps in understanding node structures and identifying errors within the tree.
  • RoslynQuoter: An online tool (or local code base) that generates CompilationUnitSyntax objects from C# code snippets, significantly easing the manual process of building trees.

Applications of Compiler API: The Compiler API empowers developers to create sophisticated tools directly integrated with the compilation process. This includes:

  • Diagnostics: Analyzers that identify problematic issues in code that the C# compiler itself might not catch, often with associated code fixes to automate corrections.
  • Refactorings: Tools that allow developers to restructure code without altering its external behavior, improving its internal structure and consistency.
  • Scripting API: Allows C# to be treated as a scripting language, enabling dynamic capabilities and interactive programming experiences (e.g., C# Interactive window, csi.exe).
  • Code Generation: Used in frameworks like Rocks for generating mock objects at runtime, and in build tools like Cake for defining build steps using a C#-like DSL.
  • Future C# Features: The API lays the groundwork for potential future C# features like “source generators” which could automatically weave common code implementations (e.g., INotifyPropertyChanged logic) into classes via compile-time attributes.

Creating .NET Compiler API Diagnostics and Code Fixes

Diagnostics and Code Fixes are powerful features within the Compiler API that allow developers to identify and automatically correct problematic issues in their code that the standard C# compiler might not catch. They enable developers to enforce coding standards, desired idioms, and framework expectations, providing immediate feedback and automated solutions.

The Need to Diagnose Compilation

Traditionally, developers often had to wait until compilation or even runtime to discover certain issues. The C# compiler, being a “closed box,” lacked public APIs to “plug into” its pipeline and augment the process, meaning tools for code analysis had to duplicate compilation logic, leading to inconsistencies. The Compiler API, or Project Roslyn, addresses this by opening up the internal workings of the .NET compilation pipeline via a public .NET API.

Diagnostics provide a “fail fast” mechanism, allowing issues to be found as soon as the code is typed in Visual Studio. This is crucial for problems that the C# compiler doesn’t know about, such as:

  • Enforcing specific API usage (e.g., using DateTime.UtcNow instead of DateTime.Now).
  • Ensuring classes adhere to specific contracts (e.g., all classes inheriting from a base class must be serializable).
  • Validating attribute values (e.g., checking TimeSpan formatting in a string attribute).
  • Preventing required base method invocations from being omitted in overridden methods.

Designing the Diagnostic

Before implementation, it’s vital to have a clear understanding of the problem and how it manifests in the code’s syntax tree. The Syntax Visualizer, a tool included with the .NET Compiler SDK, is invaluable here. It allows developers to visualize the full syntax tree for any given C# code in Visual Studio, helping to understand node structures and identify errors within the tree.

For example, when designing a diagnostic to ensure an overridden method calls its base implementation (if the base method is marked with [MustInvoke]), the Syntax Visualizer helps pinpoint the relevant nodes, such as IdentifierNameSyntax and InvocationExpressionSyntax. This process determines that the analyzer needs to check if a method is an override, if its overridden method has the [MustInvoke] attribute, and if there’s at least one invocation of that base method within the overridden method’s definition.

Creating a Diagnostic

  1. Project Setup: The Analyzer with Code Fix (NuGet + VSIX) template in Visual Studio creates a solution with three projects:
  • [ProjectName].Analyzers: A Portable Class Library (PCL) where the analyzer and code fixes are defined. PCLs have a limited set of APIs, which can restrict logic.
  • [ProjectName].Test: An MSTest-based project for unit testing the diagnostic.
  • [ProjectName].Vsix: A VS Package-based project that references the analyzer and allows for quick testing in a new Visual Studio instance. It’s recommended to separate the analyzer code from the code being analyzed, such as putting custom attributes (like [MustInvoke]) in a separate assembly.
  2. Diagnostic Class Setup:
  • The analyzer class must be decorated with the [DiagnosticAnalyzer(LanguageNames.CSharp)] attribute and inherit from DiagnosticAnalyzer.
  • It must override SupportedDiagnostics, returning an ImmutableArray of DiagnosticDescriptor objects. Each DiagnosticDescriptor defines characteristics like an identifier (e.g., “MUST0001”), title, message format, category, and severity (DiagnosticSeverity.Error for red squiggle, Warning for yellow squiggle).
  • It must override Initialize(AnalysisContext context), where you inform the Compiler API engine which types of nodes you want to analyze (e.g., RegisterSyntaxNodeAction for MethodDeclaration nodes).
  3. Analyzing Code:
  • The analysis logic resides in the method registered in Initialize (e.g., AnalyzeMethodDeclaration).
  • It’s crucial to frequently call context.CancellationToken.ThrowIfCancellationRequested() to ensure a responsive Visual Studio experience, allowing the analysis to exit if a cancellation is requested.
  • Semantic Models (IMethodSymbol) are used to provide deeper meaning to tokens, such as determining if a method is an override, finding the overridden method, and checking for attributes like [MustInvoke].
  • DescendantNodes() can be used to find specific elements within the syntax tree, such as InvocationExpressionSyntax nodes, to check for base method calls.
  • If a violation is found, context.ReportDiagnostic(Diagnostic.Create(…)) is called, specifying the DiagnosticDescriptor and the location (GetLocation()) where the error should be squiggled in Visual Studio.
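
The steps above might look like the following skeleton. This is a simplified sketch, not the book's verbatim code: the MustInvokeAttribute name and the exact checks are assumptions based on the example described.

```csharp
using System.Collections.Immutable;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class MustInvokeAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        "MUST0001", "Base method must be invoked",
        "Overriding method must call '{0}'", "Usage",
        DiagnosticSeverity.Error, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
    {
        get { return ImmutableArray.Create(Rule); }
    }

    public override void Initialize(AnalysisContext context)
    {
        context.RegisterSyntaxNodeAction(
            AnalyzeMethodDeclaration, SyntaxKind.MethodDeclaration);
    }

    private static void AnalyzeMethodDeclaration(SyntaxNodeAnalysisContext context)
    {
        // Stay responsive: exit if Visual Studio cancels the analysis.
        context.CancellationToken.ThrowIfCancellationRequested();

        var method = (MethodDeclarationSyntax)context.Node;
        var symbol = context.SemanticModel.GetDeclaredSymbol(
            method, context.CancellationToken);
        if (symbol == null || symbol.OverriddenMethod == null) { return; }

        // Does the overridden method carry [MustInvoke]?
        var mustInvoke = symbol.OverriddenMethod.GetAttributes()
            .Any(a => a.AttributeClass != null &&
                      a.AttributeClass.Name == "MustInvokeAttribute");
        if (!mustInvoke) { return; }

        // Is there at least one base.Xyz() invocation in the body?
        var baseName = symbol.OverriddenMethod.Name;
        var callsBase = method.DescendantNodes()
            .OfType<InvocationExpressionSyntax>()
            .Select(i => i.Expression as MemberAccessExpressionSyntax)
            .Any(m => m != null && m.Expression is BaseExpressionSyntax &&
                      m.Name.Identifier.ValueText == baseName);
        if (!callsBase)
        {
            context.ReportDiagnostic(Diagnostic.Create(
                Rule, method.Identifier.GetLocation(), baseName));
        }
    }
}
```

Note how the semantic model (IMethodSymbol) answers the override and attribute questions, while DescendantNodes() handles the purely syntactic search for the invocation.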

Providing Code Fixes

Code fixes provide automatic ways to correct detected issues, complementing diagnostics.

  1. Class Definition:
  • A code fix class must be decorated with [ExportCodeFixProvider(LanguageNames.CSharp)] and [Shared] attributes and inherit from CodeFixProvider.
  • It must implement FixableDiagnosticIds, returning an ImmutableArray of strings that match the diagnostic identifiers it can fix (e.g., “MUST0001”).
  • It should override GetFixAllProvider() to return WellKnownFixAllProviders.BatchFixer if you want Visual Studio to apply fixes across a document, project, or solution.
  • The core logic is in RegisterCodeFixesAsync(CodeFixContext context), an async method.
  2. Implementation Details:
  • Retrieve the relevant MethodDeclarationSyntax node using root.FindNode(diagnostic.Location.SourceSpan) and its IMethodSymbol from the semantic model.
  • Trees are immutable in the Compiler API. When “modifying” a tree, you actually get a new node or tree back. This design aids in easy comparison between nodes and efficient memory management.
  • The fix involves generating new SyntaxNode objects (e.g., InvocationExpressionSyntax for a base method call) using SyntaxFactory methods like InvocationExpression() and MemberAccessExpression().
  • Arguments are added to the invocation, handling ref or out keywords and comma separation.
  • A StatementSyntax node is created to encapsulate the invocation, potentially with a var declaration for return values. This involves generating a safe, unique local variable name (e.g., onInitializeResult, onInitializeResult0).
  • The Formatter.Annotation can be added to new nodes (WithAdditionalAnnotations) to let the code fix engine handle formatting based on Visual Studio rules.
  • Finally, context.RegisterCodeFix() is called with a CodeAction that defines the fix’s description and the function to apply the changes to the Solution.
  3. Parsing Statements vs. Building Trees:
  • You can build syntax trees manually using SyntaxFactory methods, but this can be tedious.
  • Alternatively, for simpler code fixes, you can generate the desired code as a string and use SyntaxFactory.ParseStatement() (or ParseExpression(), ParseArgumentList()) to get a StatementSyntax node directly, often resulting in much less code.
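
To illustrate the trade-off (the base.OnInitialize() call is a hypothetical example), compare building a statement by hand with parsing it from a string:

```csharp
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Formatting;

class Sketch
{
    static void Demo()
    {
        // Built by hand with SyntaxFactory: verbose, but fully controlled.
        StatementSyntax built = SyntaxFactory.ExpressionStatement(
            SyntaxFactory.InvocationExpression(
                SyntaxFactory.MemberAccessExpression(
                    SyntaxKind.SimpleMemberAccessExpression,
                    SyntaxFactory.BaseExpression(),
                    SyntaxFactory.IdentifierName("OnInitialize"))));

        // Parsed from a string: far less code for simple fixes.
        StatementSyntax parsed = SyntaxFactory.ParseStatement("base.OnInitialize();")
            .WithAdditionalAnnotations(Formatter.Annotation);   // let the engine format it
    }
}
```

Both produce a StatementSyntax that can be inserted into a new tree; the string form trades fine-grained control for brevity.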

Executing Diagnostics and Code Fixes

Once the VSIX project is set as the startup project and run, it launches a separate instance of Visual Studio with the analyzer and code fix installed as an extension. When code violating the diagnostic rule is typed, a red squiggle appears. Pressing Ctrl + . (period) on the problematic code brings up the code fix window, showing a diff view of the proposed changes and allowing the developer to apply the fix for a selected scope (document, project, or solution).

Debugging Diagnostics

  • Unit Testing: Crucial for complex compiler code. The diagnostic project template provides an MSTest-based project with helper code. Tests typically:
  • Load C# source code from a file (File.ReadAllText).
  • Create a Document instance (often via an AdhocWorkspace for testing, which differs from Visual Studio’s VisualStudioWorkspace).
  • Compile the project with the analyzer using WithAnalyzers() and retrieve diagnostics (GetAnalyzerDiagnosticsAsync()).
  • Assert on the number of diagnostics and their properties (e.g., Id, Location.SourceSpan).
  • For code fixes, they simulate the CodeFixContext, invoke RegisterCodeFixesAsync(), and then verify the CodeAction produces the expected ChangedSolution or NewText.
  • VSIX Installation for Debugging: Running the VSIX project launches an experimental Visual Studio instance where breakpoints can be set in the analyzer/fix code. Be aware that code may stop if CancellationToken is used, and Visual Studio may call code from different threads. If updates don’t appear, uninstalling and reinstalling the extension can resolve issues.
  • Visual Studio Logging: If a code fix crashes, Visual Studio disables it and shows a “yellow bar of death”. To get diagnostic information, launch Visual Studio with the /log command-line switch, which writes logging to ActivityLog.xml (location varies based on experimental mode).
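
The testing steps above can be sketched as an MSTest method. The analyzer type (MustInvokeAnalyzer), the test file path, and the expected diagnostic count are hypothetical placeholders, not the template's exact helper code.

```csharp
using System.Collections.Immutable;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MustInvokeAnalyzerTests
{
    [TestMethod]
    public async Task FindsMissingBaseInvocation()
    {
        // Load the C# code under test from a file (hypothetical path).
        var code = File.ReadAllText(@"TestCases\MissingBaseCall.cs");

        // Build a lightweight workspace/project/document for the test.
        var workspace = new AdhocWorkspace();
        var project = workspace
            .AddProject("Test", LanguageNames.CSharp)
            .AddMetadataReference(MetadataReference.CreateFromFile(
                typeof(object).Assembly.Location));
        var document = project.AddDocument("MissingBaseCall.cs", code);

        // Compile with the analyzer attached and collect its diagnostics.
        var compilation = await document.Project.GetCompilationAsync();
        var diagnostics = await compilation
            .WithAnalyzers(ImmutableArray.Create<DiagnosticAnalyzer>(
                new MustInvokeAnalyzer()))
            .GetAnalyzerDiagnosticsAsync();

        Assert.AreEqual(1, diagnostics.Length);
        Assert.AreEqual("MUST0001", diagnostics[0].Id);
    }
}
```

The AdhocWorkspace keeps the test fast and self-contained, at the cost of differing slightly from the VisualStudioWorkspace the analyzer sees in production.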

Deploying and Installing Diagnostics

There are two primary options for deploying and installing diagnostics for other developers:

  1. VSIX Packaging:
  • The generated .vsix file can be published via email, file servers, or the Visual Studio Gallery. Double-clicking the .vsix file initiates an automatic installation process.
  • Errors reported by a VSIX-installed diagnostic will NOT cause a build to fail; they only appear in the Error window.
  • A VSIX-installed diagnostic runs for every project loaded in Visual Studio, which is suitable for broad, team-wide standards.
  2. NuGet Packaging:
  • Analyzers can be published as NuGet packages. The analyzer project template typically creates the necessary files (.nuspec, PowerShell scripts).
  • Errors reported from a NuGet-installed diagnostic WILL cause a build to fail.
  • NuGet installation is per-project, meaning the diagnostic is only active in projects where the package is installed. This offers more fine-grained control, especially for framework-specific diagnostics.

Mastering Roslyn: Custom Refactorings and Workspaces

Refactorings and Workspaces are integral components of the .NET Compiler API (also known as Project Roslyn) that empower developers to enhance code structure and automate code modifications. While diagnostics identify issues, refactorings provide automatic ways to improve the internal structure of code without altering its external behavior. Workspaces, on the other hand, provide the underlying model for representing and interacting with code projects and solutions.

Understanding Refactorings

Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. This means that while the code’s functionality remains the same, its organization, readability, and maintainability are enhanced. Visual Studio offers various built-in refactorings, such as “Extract Method” to encapsulate code into a new method, “Rename” to consistently change member names, and “Remove Unnecessary Usings” to clean up using directives.

However, the Compiler API allows developers to define their own custom refactorings, enabling them to introduce specific improvements or adhere to unique coding standards that are not covered by standard tools.

Designing and Creating a Custom Refactoring

Before implementing a refactoring, it’s crucial to understand the problem and how it manifests in the code’s syntax tree. The Syntax Visualizer is an invaluable tool for this, helping to identify the specific nodes and structures involved. For instance, a refactoring to move types from a single file into their own separate files would need to consider nested types, namespace-to-folder mapping, and the inclusion of only necessary using statements in the new files.

The process of creating a refactoring generally involves:

  1. Project Setup: Using the “Code Refactoring (NuGet + VSIX)” template in Visual Studio to create a solution with an analyzer project (a Portable Class Library for defining the refactoring) and a VSIX project (for testing in an experimental Visual Studio instance). Unlike diagnostics, the template does not automatically create a test project for refactorings.
  2. Refactoring Class Definition:
  • The refactoring class must be decorated with the [ExportCodeRefactoringProvider(LanguageNames.CSharp)] and [Shared] attributes and inherit from CodeRefactoringProvider.
  • It must override the ComputeRefactoringsAsync(CodeRefactoringContext context) method, which is where the logic for detecting applicable refactorings and registering them with Visual Studio resides.
  3. Implementing the Fix:
  • Refactorings work by creating new SyntaxNode objects or entire new trees, as the trees in the Compiler API are immutable. Methods like RemoveNodes() and WithDocumentSyntaxRoot() are used to generate the desired changes.
  • The implementation will involve:
  • Identifying top-level types to move, ensuring they are not nested or already in a file matching their name.
  • Generating necessary using directives for each moved type using semantic models to understand symbol information and their containing namespaces.
  • Constructing new CompilationUnitSyntax objects for each type, potentially creating new folders based on namespace conventions, and adding them to the project.
  • Removing the moved types and their irrelevant using directives from the original file.
  • SyntaxFactory methods can be used to manually build syntax trees, or for simpler cases, SyntaxFactory.ParseStatement() can parse a string of code directly into a StatementSyntax node, which often results in less code.
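
A heavily simplified sketch of such a provider follows. It assumes top-level types with no namespace handling and skips the folder-mapping and using-pruning steps described above; all names (MoveTypeToFileRefactoring, MoveTypeAsync) are invented for illustration.

```csharp
using System.Composition;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeRefactorings;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

[ExportCodeRefactoringProvider(LanguageNames.CSharp,
    Name = "MoveTypeToFileRefactoring"), Shared]
public sealed class MoveTypeToFileRefactoring : CodeRefactoringProvider
{
    public override async Task ComputeRefactoringsAsync(CodeRefactoringContext context)
    {
        var root = await context.Document.GetSyntaxRootAsync(context.CancellationToken);
        var type = root.FindNode(context.Span)
            .FirstAncestorOrSelf<TypeDeclarationSyntax>();

        // Only offer the refactoring for non-nested types whose name
        // doesn't already match the file name.
        if (type == null || type.Parent is TypeDeclarationSyntax ||
            context.Document.Name == type.Identifier.Text + ".cs")
        {
            return;
        }

        context.RegisterRefactoring(CodeAction.Create(
            "Move type to its own file",
            token => MoveTypeAsync(context.Document, root, type)));
    }

    private static Task<Solution> MoveTypeAsync(
        Document document, SyntaxNode root, TypeDeclarationSyntax type)
    {
        var unit = (CompilationUnitSyntax)root;

        // New file: copy the original usings and add just this type.
        var newUnit = SyntaxFactory.CompilationUnit()
            .WithUsings(unit.Usings)
            .AddMembers(type)
            .NormalizeWhitespace();
        var newProject = document.Project.AddDocument(
            type.Identifier.Text + ".cs", newUnit.GetText()).Project;

        // Original file: remove the moved type (trees are immutable,
        // so this produces a new root).
        var updatedRoot = root.RemoveNode(type, SyntaxRemoveOptions.KeepNoTrivia);
        var solution = newProject.GetDocument(document.Id)
            .WithSyntaxRoot(updatedRoot).Project.Solution;
        return Task.FromResult(solution);
    }
}
```

The guard clauses in ComputeRefactoringsAsync matter as much as the fix itself: if the provider offers nothing, Visual Studio simply omits it from the Ctrl + . menu.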

Executing and Debugging Refactorings

  • Execution in Visual Studio: Unlike diagnostics, refactorings do not execute automatically or show immediate visual indicators. They are invoked by the developer placing the cursor on a piece of code and pressing Ctrl + . (period), which brings up a context menu with available refactoring options. Visual Studio often provides a diff view of the proposed changes before they are applied.
  • Unit Testing: Unit testing is crucial for ensuring the correctness and stability of complex compiler code. Although the refactoring template doesn’t include a test project, it’s recommended to add one and use helper methods (like TestHelpers.TestProvider) to simulate the Visual Studio environment for testing CodeAction generation and the resulting ChangedSolution.
  • VSIX Installation for Debugging: Running the VSIX project launches an experimental instance of Visual Studio, where breakpoints can be set in the refactoring code. This allows developers to step through the execution of their refactoring logic in a live environment. If updates to the refactoring code are not reflected, uninstalling and reinstalling the extension in the experimental instance can help.
  • Deployment: For refactorings, VSIX packaging is currently the only deployment option provided by the default template, allowing them to be shared via .vsix files or the Visual Studio Gallery.

Interacting with Workspaces

A Workspace provides an abstraction over the traditional solution-project-document structure that .NET developers are accustomed to in Visual Studio. It models a Solution containing Project objects, which in turn contain Document objects. This object model allows tools to analyze and modify code across an entire solution.

There are three common implementations of the Workspace API:

  • AdhocWorkspace: Used for quickly creating a workspace programmatically, primarily in testing scenarios. It provides a lightweight way to set up a Solution, Project, and Document for analysis or modification.
  • MSBuildWorkspace: Used when interacting with an MSBuild process. This is suitable for scenarios where code changes need to be applied during a build.
  • VisualStudioWorkspace: The workspace used when your analyzer or refactoring is running within Visual Studio itself.

Workspaces are critical for automating code updates beyond manual refactorings. For example, a “Comment Remover” refactoring can be automatically applied using MSBuildWorkspace in a command-line tool or a custom MSBuild task, or via VisualStudioWorkspace in a Visual Studio extension that listens for document save events. When changes are made through a workspace, methods like TryApplyChanges() are called to commit modifications to the solution. It’s important to remember that trees are immutable, so any “modification” returns a new node or tree, which then needs to be applied back to the Solution or Document via the workspace.
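
A sketch of that "Comment Remover" flow via MSBuildWorkspace follows; the solution path is a hypothetical placeholder, and replacing comment trivia with empty whitespace is one simple way to strip comments.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.MSBuild;

class CommentRemover
{
    static async Task Main()
    {
        var workspace = MSBuildWorkspace.Create();
        var solution = await workspace.OpenSolutionAsync(@"MySolution.sln");   // hypothetical path
        var newSolution = solution;

        foreach (var documentId in solution.Projects.SelectMany(p => p.DocumentIds))
        {
            var document = newSolution.GetDocument(documentId);
            var root = await document.GetSyntaxRootAsync();

            // Replace every comment trivia with empty whitespace.
            var comments = root.DescendantTrivia().Where(t =>
                t.IsKind(SyntaxKind.SingleLineCommentTrivia) ||
                t.IsKind(SyntaxKind.MultiLineCommentTrivia)).ToArray();
            var newRoot = root.ReplaceTrivia(comments,
                (original, rewritten) => SyntaxFactory.Whitespace(string.Empty));

            // Immutability again: accumulate changes into a new Solution.
            newSolution = newSolution.WithDocumentSyntaxRoot(documentId, newRoot);
        }

        workspace.TryApplyChanges(newSolution);   // commit all edits at once
    }
}
```

Everything up to TryApplyChanges() is pure in-memory transformation; only that final call writes anything back to disk.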

C# Scripting: Dynamic Applications and Security Considerations

The Scripting API, introduced with Update 1 of Visual Studio 2015 as part of the .NET Compiler API (Project Roslyn), enables C# to be treated as a scripting language. This provides a dynamic way to augment applications, offering capabilities that were previously unavailable to C# developers.

What is a Scripting Language?

Traditionally, scripting languages have been seen as “glue” languages. They are often simpler than other programming languages and are designed to extend a given system by orchestrating different parts and members to create new functionality. This bypasses the typical compile, test, and deploy scenarios of most applications. Well-known examples include Bash, Python, Lua, and Visual Basic for Applications (VBA) for controlling Office applications programmatically.

A common characteristic of scripting languages is their dynamic nature, where the notion of types can be loose or even non-existent, and types can change during execution. While C# maintains its strong typing semantics even in a scripting environment, the key is that a scripting language allows for a dynamic user experience, typically through a Read, Evaluate, Print, Loop (REPL).

Using the C# REPL (Interactive Window)

The C# Interactive window in Visual Studio is a REPL that leverages the Scripting API.

  • It can be opened via “View ➤ Other Windows ➤ C# Interactive window” in Visual Studio, and does not require an open project.
  • It supports simple arithmetic calculations, variable assignment, and Intellisense, recognizing variables and their types within the interactive session.
  • Strong typing is enforced, meaning a variable initially assigned an int cannot later be assigned a string.
  • Commands like #help list available session commands, #cls clears the screen, and #reset clears the current script state.
  • Developers can define types (like classes) directly within the session, which then become usable.
  • The interactive experience is also available from the command line by typing csi in the Developer Command Prompt for VS2015.
  • Code assets can be loaded:
  • The #r directive loads references to other assemblies using their full path.
  • using statements can be included in the session to reference namespaces.
  • Script code can be saved to a file (manually) and then loaded at any time using the #load directive.

Making C# Interactive (Programmatic API)

The Microsoft.CodeAnalysis.Scripting NuGet package provides the API for programmatic scripting.

Evaluating Scripts

  • The CSharpScript class is central to scripting programmatically.
  • CSharpScript.EvaluateAsync(code) is used to execute simple C# code.
  • Errors during evaluation result in a CompilationErrorException, which has a Diagnostics property to identify issues.
  • To allow scripts to use types from other assemblies, a ScriptOptions object can be passed to EvaluateAsync(), using AddReferences() to reference assemblies and AddImports() to add using statements for namespaces, so developers don’t need to provide full type names.
  • An instance of an object can be provided to the script as globals, allowing the script to use its members.
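
A short sketch of these options (the ScriptGlobals type and its Count property are invented for the example):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

public class ScriptGlobals   // hypothetical globals type
{
    public int Count { get; set; }
}

class Program
{
    static async Task Main()
    {
        // Simple evaluation.
        var sum = await CSharpScript.EvaluateAsync<int>("2 + 3");   // 5

        // References, imports, and a globals object: the script can call
        // Enumerable without a full type name and read Count directly.
        var options = ScriptOptions.Default
            .AddReferences(typeof(Enumerable).Assembly)
            .AddImports("System.Linq");
        var total = await CSharpScript.EvaluateAsync<int>(
            "Enumerable.Range(1, Count).Sum()",
            options,
            globals: new ScriptGlobals { Count = 4 });   // 1+2+3+4 = 10

        Console.WriteLine(sum + total);
    }
}
```

Members of the globals object appear to the script as if they were local variables, which is what makes host-to-script data flow so direct.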

Analyzing Scripts

  • Instead of immediate execution, CSharpScript.Create(code) can be used to obtain a Script<T> object.
  • From this Script<T> object, compilation information can be accessed via GetCompilation(), which returns a Compilation object (the base class for CSharpCompilation).
  • This allows developers to examine Diagnostics, SyntaxTrees, and SemanticModels before running the script. For example, syntax errors can be detected and reported without executing the script.
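
For instance, a deliberately broken script can be inspected without ever running it:

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Scripting;

class Program
{
    static void Main()
    {
        // Missing semicolon between the two declarations.
        var script = CSharpScript.Create("var x = 10 var y = 20;");
        var compilation = script.GetCompilation();

        // Report syntax errors without executing the script.
        foreach (var error in compilation.GetDiagnostics()
            .Where(d => d.Severity == DiagnosticSeverity.Error))
        {
            Console.WriteLine("{0}: {1}", error.Id, error.GetMessage());
        }
    }
}
```

Because GetCompilation() returns the same Compilation type used elsewhere in the API, all the tree and semantic-model techniques from earlier sections apply to scripts too.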

State Management in Scripts

  • The ScriptState class returned by RunAsync() helps retain information across multiple script executions.
  • Subsequent script code can then be run using state.ContinueWithAsync(code), allowing new script lines to reference variables and classes defined in previous executions.
  • A shared global context object can also be used to store and load values across script executions, though values are stored as object and require casting upon retrieval.
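
A minimal sketch of chained executions sharing state:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;

class Program
{
    static async Task Main()
    {
        var state = await CSharpScript.RunAsync("var total = 10;");
        state = await state.ContinueWithAsync("total += 32;");   // sees 'total'
        state = await state.ContinueWithAsync("total");
        Console.WriteLine(state.ReturnValue);   // 42
    }
}
```

Each ContinueWithAsync() call behaves like another line typed into the REPL, with earlier declarations still in scope.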

Concerns with the Scripting API

While powerful, the Scripting API carries important considerations regarding performance, memory usage, and security.

Performance and Memory Usage

  • There’s a cost associated with using scripts. Continuously generating and executing thousands of simple C# mathematical statements shows that the working set size and execution time slowly increase over time.
  • In comparison, using System.Linq.Expressions to dynamically generate and execute code offers stable working set sizes and significantly faster performance (three orders of magnitude faster for the demonstrated example).
  • However, the Scripting API’s strength lies in orchestrating other code pieces and its exploratory nature (like with a REPL), rather than high-frequency execution. It can also create new classes, which the Expressions API cannot.
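
The contrast can be sketched with the same computation done both ways; the performance figures above come from measurements like this, not from this exact snippet.

```csharp
using System;
using System.Linq.Expressions;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;

class Program
{
    static async Task Main()
    {
        // Scripting API: each call compiles and runs a full script.
        var viaScript = await CSharpScript.EvaluateAsync<int>("2 * 21");

        // Expressions API: build the operation once, compile it to a
        // delegate, then invoke that delegate cheaply as often as needed.
        var multiply = Expression.Lambda<Func<int>>(
            Expression.Multiply(Expression.Constant(2), Expression.Constant(21)))
            .Compile();
        var viaExpression = multiply();

        Console.WriteLine(viaScript == viaExpression);   // True
    }
}
```

The delegate route wins on raw throughput, but only the script can declare new classes or orchestrate arbitrary C#.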

Security

  • Giving users the ability to execute C# scripts introduces significant security risks, similar to allowing direct SQL statements (e.g., performance issues, resource use, SQL injection).
  • Malicious users could access file systems (e.g., System.IO.Directory.EnumerateFiles) to find and read sensitive files.
  • This risk extends to circumventing direct API usage checks through Reflection API calls (e.g., System.Type.GetType("System.IO.File").GetMethod(…)).
  • Furthermore, users might attempt to perform undesired mutations or persistence operations on application objects (e.g., calling a Save() method on a Person object that interacts with a database).
  • Security restrictions can be implemented by analyzing the script’s syntax tree and semantic model. A VerifyCompilation() method can traverse nodes and check for:
  • Specific method calls (e.g., Person.Save()).
  • Usage of members from blacklisted namespaces (e.g., System.IO or System.Reflection).
  • Custom diagnostics can be combined with compiler-generated diagnostics.
  • Additional security measures include:
  • API Exclusion: Blacklisting more potentially harmful APIs (e.g., System.Reflection.Emit).
  • Restricted UIs: Providing a limited user interface that generates script rather than allowing free-form code input.
  • Restricted User Accounts: Ensuring the identity used to execute the script has highly limited permissions to prevent interaction with sensitive system resources.
  • The sources emphasize that trying to limit what a script can do is non-trivial, and with flexibility comes responsibility and governance to prevent security holes.
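
A sketch of the blacklist check described above (the ScriptVerifier name and the exact traversal are illustrative, not the book's verbatim VerifyCompilation()):

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Scripting;

static class ScriptVerifier
{
    private static readonly string[] BlockedNamespaces =
        { "System.IO", "System.Reflection" };

    // Throws before the script ever executes if it touches a
    // blacklisted namespace.
    public static void VerifyCompilation(Script<object> script)
    {
        var compilation = script.GetCompilation();
        foreach (var tree in compilation.SyntaxTrees)
        {
            var model = compilation.GetSemanticModel(tree);
            foreach (var node in tree.GetRoot().DescendantNodes()
                .OfType<IdentifierNameSyntax>())
            {
                var symbol = model.GetSymbolInfo(node).Symbol;
                var ns = symbol == null || symbol.ContainingNamespace == null
                    ? null : symbol.ContainingNamespace.ToDisplayString();
                if (ns != null && BlockedNamespaces.Any(b => ns.StartsWith(b)))
                {
                    throw new InvalidOperationException(
                        string.Format("Scripts may not use '{0}'.", ns));
                }
            }
        }
    }
}
```

Calling this on the Script<T> returned by CSharpScript.Create() before RunAsync() gives a checkpoint, but as the sources stress, such checks are easy to get wrong and should be layered with restricted accounts and constrained UIs.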

In conclusion, the Scripting API is a welcome addition to C# that empowers developers to create dynamic and extensible applications, offering tools like the Interactive window and programmatic script execution. However, its use requires careful consideration of performance, memory overhead, and especially security implications.

The Compiler API: C# Development and Future Evolution

The future of the Compiler API, as discussed in the sources, is envisioned as a continuous evolution that will further empower .NET developers by enabling new tools and transforming the fundamental way C# code is written.

Current Usage of the Compiler API

Beyond enabling diagnostics, refactorings, and the Scripting API, the Compiler API’s functionality is accessible for use in any C# code, allowing developers to integrate it into their own projects via NuGet packages.

The sources highlight several examples of how the Compiler API is already being utilized:

  • Generating Mocks: Mocking frameworks, such as Moq and NSubstitute, traditionally synthesize new classes at runtime using System.Reflection.Emit. This process requires knowledge of Intermediate Language (IL), which can be difficult and prone to errors. In contrast, the Rocks mocking framework, created by the author, leverages the Compiler API to generate mocks. This allows for the dynamic creation of classes using pure C# code, making debugging generated code “extremely simple” because it works within the Compiler API’s intended design. For example, stepping into a generated mock in Visual Studio reveals a C# class with a Guid in its name to prevent collisions, inheriting from the target interface, and compiled with debug symbols.
  • Building Code with Code (Cake): MSBuild has been the standard for building .NET code, but other tools like Cake (http://cakebuild.net/) use the Compiler API to execute build steps. Cake defines a C#-like Domain Specific Language (DSL) for build processes. Developers write build scripts in C# syntax, declare variables, and use other .NET libraries. Cake tasks can have dependencies and execute code, such as building a solution with MSBuild() or running tests with MSTest(). This allows developers to automate complex build and deployment scenarios in a familiar language.
  • Other Tools and Frameworks: The Compiler API underpins a growing number of tools and packages, including:
  • DotNetAnalyzers and StyleCopAnalyzers (diagnostics enforcing coding rules).
  • ScriptCS (another C# scripting implementation).
  • OmniSharp (a .NET editor written in .NET).
  • RefactoringEssentials (a suite of refactorings and analyzers).
  • ConfigR (uses C# code for configuration files).

Looking into C#’s Future (Source Generators)

The most significant anticipated change to C# itself, empowered by the Compiler API, is the introduction of source generators. This experimental feature aims to allow code generation to become an “integral part of the language”.

The core idea is to introduce compile-time attributes that are “active” rather than “passive” metadata. When the C# compiler encounters these attributes, it would look for their presence and trigger their associated implementation to generate new code that augments the target class or member.

A prime example used to illustrate this is property change notification with INotifyPropertyChanged. Currently, implementing this interface often involves boilerplate code or relying on base classes, which restricts single-class inheritance in C#. With a hypothetical [PropertyChanged] attribute, a developer could simply write:

```csharp
[PropertyChanged]
public partial class IntegerData
{
    public int Value { get; set; }
}
```

The C# compiler would then automatically generate the necessary INotifyPropertyChanged implementation, including the PropertyChanged event and the logic within each property setter to raise the event when the value changes. This drastically reduces the amount of manual code.
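
One plausible shape of that woven-in code follows. This is entirely hypothetical, since the feature does not exist; the assumption is that the generator would replace the auto-implemented property with a notifying equivalent rather than add a conflicting partial member.

```csharp
using System.ComponentModel;

// Hypothetical output of a [PropertyChanged] source generator.
public partial class IntegerData : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private int value;

    public int Value
    {
        get { return this.value; }
        set
        {
            if (this.value != value)
            {
                this.value = value;
                var handler = this.PropertyChanged;
                if (handler != null)
                {
                    handler(this, new PropertyChangedEventArgs(nameof(Value)));
                }
            }
        }
    }
}
```
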

This concept extends to other repeatable code generation scenarios, such as:

  • Object disposal (IDisposable) with checks for ObjectDisposedException on members.
  • Method call thresholds, like ensuring a Dispose() method is called only once, or a CallTwice() method is invoked a maximum of two times.
  • Consistent ToString() patterns for classes.

The generated code from source generators would still be C# code, making it fully analyzable and debuggable, just like code written manually. This approach promises to simplify implementations by exploiting patterns and aspects, allowing developers to write less boilerplate code.

The Compiler API is seen as central to the ongoing transformation of .NET, including the rearchitecting of the .NET Framework into .NET Core, and potentially future targets like WebAssembly, making C# a language capable of running natively in the browser. This open-source model encourages community contribution to its continuous evolution.

By Amjad Izhar
Contact: amjad.izhar@gmail.com
https://amjadizhar.blog

