Sunday, 18 September 2016

Performance Optimization in Unity3D for Mobile Games

This is my sixth post on Unity3D so far. If you wonder what I have said before, please take a look at the earlier posts.


Performance is important. Nobody argues with that; it matters on every software project, whether backend, web, or mobile. Whatever the technology and whoever is using it, everyone wants the product to be fast, smooth and responsive.

In casual games, maybe it is a little more important. Since it is a casual game, it should open very quickly: you just want to play for 5 minutes, so you should not have to wait minutes for it to open. More importantly, it should work on any kind of device, not just high-end ones, because you can't expect anyone to own such a device only for gaming.

When you consider all of this, performance is not something you should leave for later. You have to consider it from the very beginning of the project and keep it in mind while developing every feature.

With Unity it is harder, I guess, because it is a black-box engine that supports many devices, from mobile phones to game consoles. It is easy to create a game project and develop a playable version, but it is really hard to polish it in terms of performance. It has a lot of adjustment points; some of them make console games much better but make mobile games suffer. So it is important to have good knowledge of and experience with Unity to optimise your game.

There is a talk from Unite 2016 Europe about Optimizing Mobile Applications; I strongly recommend it.

The main things you'll want to consider are the size of your game, how much memory your game uses, and how much computing power your game needs. When targeting mobile devices, remember that a large portion of the market has phones that are a few years old. To reach the widest possible audience, you should optimize your game for low-spec devices. Unity releases very useful statistics about the specs of mobile devices running Unity games. 

I will mention three practical points about what we have done in our card games to improve performance. 

Sprite Sheets


In computer graphics, a sprite is a two-dimensional bitmap that is integrated into a scene. To render that sprite into the scene, the game engine has to issue a draw call to the GPU. The whole purpose is to reduce Draw Calls and increase Total Batched Calls. A draw call is one request to the GPU to draw something. We want everything batched, because a batch combines several draw calls into a single pass to the GPU; the more draw calls we batch into a single pass, the better the rendering performance of our game.

To see how these objects are rendered to the screen we use the Frame Debugger, which shows which sprites are rendered and in which order. When you look at the GIF image on the left, you can see that some sprites are rendered in a call of their own, while some are rendered together with other sprites, which means they are rendered in a batch. No matter how many sprites a batch contains, it counts as one draw call, meaning one call to the GPU. To be able to batch all sprites, we use sprite sheets: we pack every sprite into one big atlas file and load that to the GPU, which increases the Batched Calls and reduces the Draw Calls.

A sprite sheet (atlas file) is a bitmap image file that contains several smaller graphics in a tiled grid arrangement. By compiling several graphics into a single file, you enable an application to use the graphics while only needing to load a single file. Unity has a built-in Sprite Packer to create sprite sheets from your sprites. We use TexturePacker, which is a better packer than Unity's built-in one.

Another key point is the scene hierarchy. Unfortunately, it also affects the draw call count, especially if you have multiple sprite sheets. Let's say you have two sprite sheets and you need to render some sprites from one and some from the other in the same scene. Try to organize your hierarchy so that all the sprites that come from the same sprite sheet are grouped together; they will then be rendered in a batch with one draw call. Then put the sprites that come from the second sprite sheet after them.
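
To make the effect of ordering concrete, here is a minimal sketch in plain C# (no Unity types). It assumes a deliberately simplified model: sprites are drawn in list order, and a run of consecutive sprites from the same atlas costs one draw call. Real Unity batching also depends on materials, sorting layers and other factors, and the names `BatchModel` and `CountDrawCalls` are made up for this illustration.

```csharp
using System.Collections.Generic;

public static class BatchModel
{
    // Simplified model (an assumption, not Unity's actual batcher):
    // each run of consecutive sprites that share an atlas is one draw call.
    public static int CountDrawCalls(IList<string> atlasPerSprite)
    {
        int drawCalls = 0;
        string previousAtlas = string.Empty;
        for (int i = 0; i < atlasPerSprite.Count; i++)
        {
            // A new draw call starts at the first sprite and whenever the atlas changes.
            if (i == 0 || atlasPerSprite[i] != previousAtlas)
            {
                drawCalls++;
            }
            previousAtlas = atlasPerSprite[i];
        }
        return drawCalls;
    }
}
```

Under this model, four sprites ordered as atlas A, B, A, B cost four draw calls, while the same sprites ordered A, A, B, B cost two, which is why grouping by sprite sheet in the hierarchy pays off.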

Unity Profiler


In the editor there are two kinds of profiling tools.

One is the Stats window. While your game is running, click the Stats button at the top right of the Game window to open it. Here you can see some basic stats about your game. At the top is your frames per second, which is not that useful since it will be different when running on a mobile device. But things like the number of draw calls, triangles, and textures should be consistent with what you'll get on mobile. The more draw calls and triangles you have, the more processing power your game will require to render your scene. There's no hard and fast rule as to how many draw calls you should have, since it depends a lot on what shaders you use. Under 50 draw calls should run quickly on most devices, although newer devices and tablets can handle more. If your game is running slowly and you see that you have hundreds or thousands of draw calls, that's probably one of the reasons.

The number of textures being used is another good metric to consider. If you're loading and unloading textures at run time, like we did when switching atlases, the number of textures in the stat window will let you check if they are being de-allocated properly or if they're still in memory. You can also see how much memory is being used by textures at any given time.

Staying under 30% of the available memory on the device is a good idea. But remember that there are other things in the scene that also use memory, besides the textures. Even without the pro version of Unity, you can get some useful information about the performance and memory usage of your game using the Stats window.
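
The 30% rule of thumb is simple arithmetic, sketched here in plain C#. The helper names (`MemoryBudget`, `BudgetMb`, `IsWithinBudget`) are hypothetical. The device memory figure is passed in as a parameter; inside Unity it could, for example, come from `SystemInfo.systemMemorySize`, which reports the device memory in megabytes, but that wiring is an assumption and not something described in this post.

```csharp
public static class MemoryBudget
{
    // Budget in MB for a given fraction of the device memory.
    public static double BudgetMb(int deviceMemoryMb, double fraction)
    {
        return deviceMemoryMb * fraction;
    }

    // True when the measured usage stays within the budget.
    // The default of 0.30 mirrors the 30% guideline in the text.
    public static bool IsWithinBudget(double usedMb, int deviceMemoryMb, double fraction = 0.30)
    {
        return usedMb <= BudgetMb(deviceMemoryMb, fraction);
    }
}
```

For a device with 1024 MB, the budget is roughly 307 MB for everything in the scene, so 300 MB of usage passes the check and 320 MB does not.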

The second one is the Profiler window. The Unity Profiler window helps you to optimize your game. It reports how much time is spent in the various areas of your game. For example, it can report the percentage of time spent rendering, animating or in your game logic. You can play your game in the Editor with profiling on, and it will record performance data. The Profiler window then displays the data in a timeline, so you can see the frames or areas that spike (take more time than others). By clicking anywhere in the timeline, you make the bottom section of the Profiler window display detailed information for the selected frame. Before I go through what you can measure with the profiler, note that profilers themselves add some overhead. So when profiling it is typical to consider only the ratio (or percentage) of time spent in certain areas.
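
Why ratios are more trustworthy than absolute times can be shown with a small plain C# sketch. It assumes the idealised case where profiler overhead inflates every area by the same factor; real overhead is not perfectly uniform. `ProfileShares` and `ToPercentages` are hypothetical names, and the timings are made-up numbers.

```csharp
public static class ProfileShares
{
    // Converts absolute times per area (in ms) into percentages of the total.
    public static double[] ToPercentages(double[] millisecondsPerArea)
    {
        double total = 0.0;
        for (int i = 0; i < millisecondsPerArea.Length; i++)
        {
            total += millisecondsPerArea[i];
        }

        var shares = new double[millisecondsPerArea.Length];
        for (int i = 0; i < millisecondsPerArea.Length; i++)
        {
            // Guard against an empty or all-zero sample.
            shares[i] = total > 0.0 ? 100.0 * millisecondsPerArea[i] / total : 0.0;
        }
        return shares;
    }
}
```

With rendering, animation and game logic at 8 ms, 2 ms and 6 ms, the shares are 50%, 12.5% and 37.5%. If profiling doubles every figure to 16 ms, 4 ms and 12 ms, the absolute times are misleading but the shares stay the same.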

The CPU Usage Profiler displays where time is spent in your game. When it is selected, the lower pane displays hierarchical time data for the selected frame. See documentation on the Profiler Window to learn more about the information on the Profiler timeline.

The Rendering Profiler displays rendering statistics. The timeline displays the number of Batches, SetPass Calls, Triangles and Vertices rendered. The lower pane displays more rendering statistics, which closely match the ones shown in the Stats Window.

The Memory Profiler shows a simple overview of how memory is used throughout Unity in real time on a per-frame basis. It also allows you to take a snapshot of the current state. After taking a sample, the Profiler window is updated with a tree view where you can explore memory usage.

The Audio Profiler monitors significant performance meters about the audio system, such as total load and voice counts. When you highlight the pane, the lower part of the window changes into a detailed view about various parts of the audio system not covered by the graphs.

The Physics Profiler shows statistics such as the numbers of Active and Sleeping Rigidbodies and of Static and Dynamic Colliders.

The GPU Profiler is similar to the CPU Profiler, with various contributions to rendering time shown as a hierarchy in the lower panel. Select an item from the hierarchy to see a break-down of contributions in the right-hand panel.

For more information about the Unity Profiler, please refer to the official Unity documentation.

One last thing: there is a new, very low-level memory profiler API. It can tell you which objects got blamed for how much C++ memory allocation. It will also give you a dump of the entire C# heap, as well as C# type descriptions. This API is too low level for most people to benefit from. The intention is to write a much nicer UI window on top of this API that will actually be readily useful for many users, helping them figure out which objects are loaded, which objects take a lot of memory, and, most importantly, why an object is in memory. This repository is that nicer UI window, very much in progress of being built. It is an open source project by Unity.

Xcode Instruments


There is an official Unity post about how to use Xcode as a profiler. The Unity team also officially suggests using Xcode Instruments. Even if you don't have any performance problems, it will still give you a lot of detail about how Unity works under the hood, because the profiler shows not just your code but also Unity's internal code.

Xcode Instruments can see things that the Unity Profiler can't, such as application startup time. In the Time Profiler you can see what makes your app wait while it shows the splash screen. Most of that time is spent in internal Unity code, but there are two points where your code affects it.

If you look at the image you will see PlayerLoadGlobalManagers, which is responsible for indexing your Resources. So if you have too many items in the Resources folder, it will directly affect your app's startup time. In the beginning we put all our resources into this folder, even those that did not need to be loaded dynamically. After learning this, we left only the dynamically loaded images in the folder and reduced startup time by around 5.4%.

The second item is the Awake functions of the game objects in the first scene (consider that we have only one scene). All Awake functions are called during the splash screen. We calculated that 19.2% of startup time is due to these functions. When we investigated more deeply, we found that almost all of this time is spent by StrangeIoC calculating its dependency graph. We haven't done anything to reduce this time yet.

On the other hand, while the game is running, you can profile the update loop of UnityEngine. On each frame you can see the internal code as well as your own code executed in the Update, FixedUpdate and LateUpdate functions. You will also see your code executed under DelayedCallManager, which is there for your Coroutines. And you may see some extra time under CanvasManager, which is responsible for Unity's whole UI system, such as calculating text sizes and rendering text.

Another very handy Xcode instrument is Leaks. It shows you the memory leaks in your code (sometimes they are in internal code), and it often gives you a good idea of how to handle or get rid of the leak.

Actually, even without memory leaks, you might still have memory issues. The most common one is memory fragmentation. Unity has automatic memory management, which is cool. But if your game creates a lot of temporary objects while executing your code, these automatic allocations will end up fragmenting memory. Don't forget that in Unity the heap only expands and never shrinks. So it is very important to avoid temporary objects. To avoid them, you have to know when they are created. Here are the most common sources:
  • Use simple for loops instead of foreach loops. The reason is that a foreach loop internally creates a new enumerator instance.
  • Avoid boxing, which is the process of converting a value type to the type object or to any interface type implemented by this value type. Each boxing operation allocates a new object on the heap.
  • String concatenation also creates temporary string objects; a StringBuilder can be used instead.
  • All Unity APIs that return arrays (CreateScriptingArray) will allocate new copies of arrays. Minimize calls to these methods.
  • LINQ usage creates a large number of temporary objects, so examine the time lost to creating and discarding LINQ queries; consider replacing hotspots with manually optimized methods.
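
A few of the points above can be sketched together in plain C#. This is a minimal illustration, not code from our games: `ScoreFormatter`, `Total` and `Label` are hypothetical names. It sums with an index-based for loop instead of foreach or LINQ, and builds a label with a reused StringBuilder instead of string concatenation.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ScoreFormatter
{
    // One builder reused across calls, so no new builder is allocated per label.
    private static readonly StringBuilder Builder = new StringBuilder(64);

    // Index-based for loop: no enumerator object and no LINQ query objects.
    public static int Total(List<int> scores)
    {
        int total = 0;
        for (int i = 0; i < scores.Count; i++)
        {
            total += scores[i];
        }
        return total;
    }

    // Append(int) is a typed overload, so the score is not boxed.
    // ToString() still allocates the final string, which is unavoidable here.
    public static string Label(string playerName, int score)
    {
        Builder.Length = 0; // clear the previous contents
        Builder.Append(playerName).Append(": ").Append(score);
        return Builder.ToString();
    }
}
```

Because the builder is shared and static, this sketch is only suitable for single-threaded use such as the main game loop.
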
So I think this is all I want to say about optimizing a mobile Unity game. Actually, I was planning to write a few lines about optimizing the size of the binaries (on iOS and Android), but I guess that is part of another topic. Let's see.

I am almost done with these Unity-related blog posts. A few of them are left. Just hang in there :)