Things you probably shouldn't do: Bending IL2CPP to your will

Disclaimer: These blog posts will be discussing some... unorthodox approaches. If you think pointer math and gotos in C# are sacrilege then these posts might not be for you.

Hello and welcome to this blog! I will be writing a series of blog posts about my experience with different Unity topics. Developing CodeFiCS requires a low-level approach and takes me places a lot of developers usually don't go. It is my hope that these posts will help you understand some inner workings of Unity and allow you to write better code and create more ambitious projects.

Today I'll be talking about some lesser-known features of IL2CPP builds. IL2CPP does a lot more than just make your code run faster, but to make sure we're on the same page let's start with some basics.

What is IL2CPP?

You may know IL2CPP simply as a tool for converting C# code into C++, but it is so much more than that! C# isn't compiled into native code (instructions that CPUs execute) - it is compiled into Intermediate Language (the IL in IL2CPP). IL resembles assembly, but has a number of abstractions (e.g. using field tokens instead of memory offsets when loading/storing a value). You cannot run a C# executable without an additional layer - the Common Language Runtime (CLR). It plays two roles. First and foremost, it compiles IL into native code suited to the current CPU architecture. This process is known as Just-In-Time (JIT) compilation. The JIT compiler determines the best way to lay out type fields and generates native code based on CPU capabilities and other parameters. This allows C# to run pretty efficiently on any machine with a CLR, but compiling at runtime can have a significant performance impact, especially when a bunch of methods are called for the first time and need to be compiled.

The second role of the CLR is to provide a number of features that allow the generated code to run. This includes things like garbage collection, platform-specific APIs, metadata, implementations of a bunch of mscorlib methods and much more. Without it IL is nothing more than a fancy low-level pseudocode.

So how does IL2CPP factor into this? Well, if you simply convert IL code into C++ it will be missing all of those features that the CLR provides. Accessing DateTime.Now will fail since there will be no layer for the C++ code to access the OS's time-related API. So how does IL2CPP solve this? By providing its own CLR! The IL2CPP CLR is based on Mono. You can find its source code in the %UnityEditorFolder%/Data/il2cpp/libil2cpp folder, or in IL2CPP builds you make if you use the Create Visual Studio Solution option.

IL2CPP has some commonalities with Mono, but many of its implementations differ significantly. IL2CPP builds need to be compiled Ahead-Of-Time (AOT, the opposite of JIT) to allow execution on platforms that prohibit runtime code generation, so you will quickly notice that the whole JIT compiler is missing. With it goes the metadata that was designed to support it or that is its byproduct. A lot of core metadata types (MonoClass - the CLR representation of System.Type, MonoMethod - the CLR representation of System.Reflection.MethodBase, etc.) have been redesigned from the ground up to simplify access to data. IL2CPP started out as a branch of Mono, but has grown into its own CLR with a lot of interesting things to uncover.

So how does Unity factor into this?

When Unity calls IL2CPP it passes all assemblies that need to be converted as a parameter. This naturally includes all user assemblies and also the managed Unity ones (UnityEngine.dll, UnityEngine.PhysicsModule.dll, etc.). One thing it does NOT include is Unity's internal unmanaged assembly (UnityPlayer.dll). It is at the core of Unity and its source code is kept under a lot of secrecy, so naturally it cannot be included as source code in builds. It does, however, need to interact with IL2CPP-generated code, so how exactly does that happen? If you build a Unity project into a Visual Studio solution, you will notice that the solution is split into 4 projects.

The first project - Il2CppOutputProject - is where all the C#-converted code resides. It also contains the IL2CPP CLR and a few other things. UnityData contains some defines and resources and can be ignored in most situations. UnityPlayerStub is at the core of the mystery and I'll get back to it in a bit. And last but not least, there is a project whose name matches the solution name (and the name of your Unity project). There's not much here aside from a few defines, but it has the most important piece of the whole application - the Main method (well, actually it's wWinMain for Windows Standalone builds, but it serves the same purpose and I'll be referring to it as Main for brevity).

This method quickly transfers control to a UnityMain method, but if you look up its definition you will be disappointed, as all it does is return 0. How come? How is your game able to run if there's nothing happening in the Main method? UnityMain is defined in the Exports.h file of the UnityPlayerStub project. It is the only piece of code in that project, and it seemingly does nothing, so why is it there? Take a moment to consider all the pieces at play: you have your Main method in one project that needs to call UnityPlayer.dll, which in turn needs to call methods from Il2CppOutputProject, all without revealing UnityPlayer's inner workings and while compiling for different architectures (unrelated to this post, but an important piece). UnityPlayerStub is actually compiled into a UnityPlayer.dll, but it is not the one that gets included in final builds - this one is ~36 KB, while the one containing Unity's guts is over 40 MB. This illustrates a neat little trick - the Visual Studio solution includes a 'placeholder' project that stands in for Unity internals and, once compiled, is replaced with the actual thing (technically the actual implementation is copied into the build folder when Unity creates the Visual Studio solution, but that's semantics). And all that is possible thanks to two magical words: __declspec(dllexport).

So what does it mean?

Internals of dllexport (and dllimport) are outside the scope of this post, but in (extremely) oversimplified terms it is the C++ equivalent of P/Invoke. It allows DLLs to store information about the methods they declare so that other DLLs/executables can link to those methods at runtime. This is why the UnityPlayer.dll trick works - your project isn't compiled referencing the UnityMain method from the UnityPlayerStub project - it is compiled to reference that method in UnityPlayer.dll, and it doesn't care where that file comes from. Similarly, UnityPlayer references a number of methods from Il2CppOutputProject.
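
To make the export side a bit more concrete, here is a minimal sketch of what a placeholder export could look like. This is not the contents of Exports.h: the macro name STUB_EXPORT, the function name UnityMainPlaceholder and the parameterless signature are all my assumptions for illustration. The only parts taken from the text are the __declspec(dllexport) marker and a body that does nothing but return 0.

```cpp
// Hypothetical export macro (not Unity's): __declspec(dllexport) on Windows,
// default symbol visibility elsewhere so the sketch also builds off Windows.
#if defined(_WIN32)
#define STUB_EXPORT __declspec(dllexport)
#else
#define STUB_EXPORT __attribute__((visibility("default")))
#endif

// extern "C" turns off C++ name mangling, so the symbol is exported under
// the plain name "UnityMainPlaceholder" and can be looked up by that name.
extern "C" STUB_EXPORT int UnityMainPlaceholder()
{
    // Like the stub described above: no engine code, just a return value.
    return 0;
}
```

A caller built against this export only records the function's name and the DLL's file name, which is why a different DLL exporting the same name can take the stub's place.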

Why is this useful?

UnityPlayer is a beast and needs access to a lot of methods and data to function. Doing something as simple as calling Update methods on active MonoBehaviours requires extensive knowledge of your game's functionality. So in order to achieve that (and much more), Il2CppOutputProject dllexports numerous methods and UnityPlayer binds to them at runtime. This is a neat little trick that isn't very useful to us until you consider that Unity has a special meaning for DllImport("__Internal"). You may have come across this if you ever tried using native plugins on iOS or had C++ code in your Unity project. I will be talking more about the mechanics of this trick in another post, but DllImport("__Internal") allows you to call dllexported methods from within Il2CppOutputProject. Usually those are methods you define yourself, but, as I have found out, it also works with the IL2CPP API. Any method that is dllexported from Il2CppOutputProject can be DllImported in C#, and the IL2CPP CLR has a bunch of them! There are over 200 dllexported methods you can access, and most of them are in \Il2CppOutputProject\IL2CPP\libil2cpp\il2cpp-api-functions.h. These methods can help you tweak the performance of your application, profile data layout or even do things you may have thought impossible in C#. There is no point in listing them all (especially since they're easy to find), but here are a few to pique your curiosity.


    il2cpp_object_get_size
    il2cpp_gc_set_max_time_slice_ns
    il2cpp_gc_foreach_heap
    il2cpp_gc_disable / il2cpp_gc_enable
    il2cpp_stop_gc_world / il2cpp_start_gc_world
    il2cpp_thread_walk_frame_stack

Using these methods is as simple as:

    // Requires: using System.Runtime.InteropServices;
    [DllImport("__Internal", CallingConvention = CallingConvention.Cdecl, EntryPoint = "il2cpp_object_get_size")]
    public static extern uint GetObjectSize(object obj);

It goes without saying, but this trick only works in IL2CPP builds. There are ways to do similar things in Mono, but that's a topic for another post. Naturally, extreme caution and rigorous testing should be exercised when doing this, as some methods can have a huge impact on your apps (e.g. turning off the GC is an easy way to run into an endangered species - OutOfMemoryException). The API can also change between major Unity releases, but in my experience the methods in il2cpp-api-functions.h are fairly consistent.
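
For comparison, the more usual target of DllImport("__Internal") is a method you define yourself. Below is a minimal sketch of the native side, assuming a C++ source file that gets compiled into the IL2CPP build alongside the generated code. The function name SumInts and the macro PLUGIN_EXPORT are hypothetical. On the C# side the matching declaration would follow the same pattern as GetObjectSize, with EntryPoint = "SumInts", an int[] and an int as parameters and long as the return type.

```cpp
#include <cstdint>

// Hypothetical export macro: __declspec(dllexport) on Windows, default
// symbol visibility elsewhere so the sketch also builds off Windows.
#if defined(_WIN32)
#define PLUGIN_EXPORT __declspec(dllexport)
#else
#define PLUGIN_EXPORT __attribute__((visibility("default")))
#endif

// Hypothetical user-defined function. extern "C" keeps the symbol name
// unmangled so that EntryPoint = "SumInts" matches it exactly.
extern "C" PLUGIN_EXPORT int64_t SumInts(const int32_t* values, int32_t count)
{
    int64_t total = 0;
    for (int32_t i = 0; i < count; ++i)
    {
        total += values[i]; // widened to 64 bits to avoid overflow
    }
    return total;
}
```

Sticking to fixed-width integer types on the native side keeps the mapping to C# int and long unambiguous across platforms.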