At its recent Developer Summit, AMD presented in-depth details on its graphics API, Mantle. While full documentation is not yet available, the GPU manufacturer has provided enough information for developers to grasp the fundamental workings of Mantle and the opportunities it may offer from a game development perspective.
Compared to DirectX or OpenGL, Mantle is a lower-level graphics API that gives developers direct access to the GPU. This doesn’t necessarily mean that developers must give up the ease of programming the GPU in a high-level language. AMD’s goal is to bring the API down to the lowest level of abstraction that still makes sense for graphics engine development. As a result, part of Mantle would sit at a lower level, while certain layers of abstraction would remain at a higher level.
Overcoming Conventional API Problems
AMD summarizes the following existing hurdles in development on PC:
- API overhead
- lack of efficient threading
- lack of effective control over memory management
- lack of direct access to the GPU
With Mantle, AMD looks to overcome these issues by letting the application pre-build and reuse certain data to reduce redundancy, manage memory itself, and control both the generation of draw calls and their execution. The application generates lists of draw calls that are placed in the appropriate GPU queue. The fundamental difference is that, instead of the driver managing and controlling distributed rendering with varying degrees of effectiveness, the application manages rendering as finely as the developer requires. The aim, as AMD puts it, is to put the developer in the driver’s seat.
Reducing the overhead of rendering commands is not enough, however. Efficient multi-threading is just as important. Unlike DirectX 11, Mantle allows the application to prepare several command lists in parallel and to control multi-threading without sacrificing performance or reliability.
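The idea of application-built command lists can be illustrated with a minimal sketch. Since full Mantle documentation is not yet available, nothing here is Mantle's actual API: `DrawCommand`, `CommandList`, `GpuQueue` and `record_and_submit` are hypothetical stand-ins, and the "queue" is just a vector. The sketch only shows the pattern of several threads each recording their own list, with the application then submitting the lists in an order it chooses.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Mantle types.
struct DrawCommand {
    std::uint32_t mesh_id;
    std::uint32_t instance_count;
};

using CommandList = std::vector<DrawCommand>;

// Toy queue: receives finished command lists in the order the application picks.
struct GpuQueue {
    std::vector<DrawCommand> submitted;
    void submit(const CommandList& list) {
        submitted.insert(submitted.end(), list.begin(), list.end());
    }
};

// Records one command list per worker thread, then submits them in worker order.
// Each worker writes only to its own list, so recording needs no locks.
inline void record_and_submit(std::uint32_t mesh_count, unsigned worker_count,
                              GpuQueue& queue) {
    assert(worker_count > 0);
    std::vector<CommandList> lists(worker_count);
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < worker_count; ++w) {
        workers.emplace_back([&lists, w, worker_count, mesh_count] {
            // Worker w records meshes w, w + worker_count, w + 2 * worker_count, ...
            for (std::uint32_t m = w; m < mesh_count; m += worker_count) {
                lists[w].push_back(DrawCommand{m, 1});
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    // Submission stays on one thread; only the recording was parallel.
    for (const auto& list : lists) {
        queue.submit(list);
    }
}
```

In this sketch the expensive part (recording) scales with the number of threads, while submission remains a cheap, ordered step that the application controls.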
One of the main problems with conventional PC APIs is the overhead they add to rendering commands (draw calls), whether through the various validation checks or through the translation into native GPU commands.
AMD explains that most of today’s PC games use 3,000 to 5,000 draw calls per frame, a figure that may rise to 10,000 for developers who fully optimize their rendering techniques. With Mantle, AMD is offering the potential to raise this figure by an order of magnitude, to as many as 100,000 draw calls. This in turn gives developers the opportunity to scale up their engines for richer game worlds with far greater visual fidelity in terms of object and environmental detail, as well as GPGPU-driven physics simulations.
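To get a feel for what these figures imply, the per-call CPU budget can be worked out with a few lines of arithmetic. This is a back-of-the-envelope sketch under two assumptions that do not come from AMD: the draw-call counts are per frame, and a single CPU thread spends the whole frame submitting draws at a 60 fps target. The function name `per_draw_budget_us` is made up for this example.

```cpp
#include <cassert>

// CPU time available per draw call, in microseconds, if one thread spent the
// entire frame doing nothing but submitting draws. Both inputs are assumptions
// chosen for illustration, not figures published by AMD.
inline double per_draw_budget_us(double frames_per_second, double draws_per_frame) {
    assert(frames_per_second > 0.0 && draws_per_frame > 0.0);
    const double frame_time_us = 1e6 / frames_per_second;
    return frame_time_us / draws_per_frame;
}
```

Under those assumptions, 5,000 draw calls leave about 3.3 microseconds per call, while 100,000 draw calls leave about 0.17 microseconds, which is why per-call API overhead has to shrink for the higher figure to be reachable.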
While seeking optimal performance and adding features, AMD has made a point of keeping the API relatively simple, so that developers can easily visualize its operation and predict its behavior. This does not mean that fully exploiting Mantle will be easy, but the good news for developers is that no problems will be hidden behind higher-level complexity.
The application also acquires the ability to take control of multi-GPU configurations and decide where to run each issued command. What AMD is essentially offering is direct access to data transfer between multiple GPUs, with flexible workload scaling and partitioning. Mantle will also support asymmetric multi-GPU systems, such as an APU working in conjunction with a dedicated GPU. This opens up novel usage scenarios, such as the dedicated GPU handling the rendering load while all post-processing is offloaded to the APU.
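The asymmetric scenario can be pictured as a simple routing decision made by the application rather than by the driver. The sketch below is purely illustrative: `Device`, `PassKind` and `plan_frame` are hypothetical names, and the policy (post-processing goes to the APU when one is present) is just the example scenario described above, not a rule defined by Mantle.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names for illustration only; not part of any real API.
enum class Device { DiscreteGpu, Apu };
enum class PassKind { Geometry, Lighting, PostProcess };

// Decides, pass by pass, which device should execute the work.
// Policy assumed here: offload post-processing to the APU when it is available,
// keep everything else on the dedicated GPU.
inline std::vector<Device> plan_frame(const std::vector<PassKind>& passes,
                                      bool apu_available) {
    assert(!passes.empty());
    std::vector<Device> plan;
    plan.reserve(passes.size());
    for (PassKind kind : passes) {
        const bool offload = apu_available && kind == PassKind::PostProcess;
        plan.push_back(offload ? Device::Apu : Device::DiscreteGpu);
    }
    return plan;
}
```

A real engine would also have to schedule the transfer of the rendered frame between the two devices, which this sketch leaves out.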
Mantle also introduces a new type of object called the monolithic pipeline, which bundles all shaders, GPU states and related settings into a single block. The flexibility of this pipeline should pave the way for new types of rendering that are impossible at present with a high-level API. With DirectX, managing GPU states and shaders can result in a heavy CPU load. Mantle reduces this burden by making the compiler’s job easier and by taking advantage of a global view of the rendering.
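One way to picture why a single pipeline object lowers CPU load: all shaders and states are validated once, when the object is created, and binding it afterwards is little more than a pointer swap. The sketch below assumes that reading of the design. `PipelineDesc`, `Pipeline`, `Context` and the hash-based `state_key` are hypothetical stand-ins and not Mantle's real interface.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical description of everything a draw needs; NOT Mantle types.
struct PipelineDesc {
    std::string vertex_shader;
    std::string pixel_shader;
    bool depth_test;
    bool alpha_blend;
};

// Immutable result of the one-time validation and "compilation" step.
struct Pipeline {
    PipelineDesc desc;
    std::uint64_t state_key;  // stand-in for the compiled GPU state block
};

struct Context {
    const Pipeline* bound = nullptr;
    int validations = 0;  // counts how often the expensive path runs

    // Expensive path: runs once per pipeline, not once per draw call.
    Pipeline create_pipeline(const PipelineDesc& desc) {
        assert(!desc.vertex_shader.empty() && !desc.pixel_shader.empty());
        ++validations;
        std::hash<std::string> hash{};
        std::uint64_t key = hash(desc.vertex_shader);
        key = key * 31 + hash(desc.pixel_shader);
        key = key * 31 + (desc.depth_test ? 1u : 0u);
        key = key * 31 + (desc.alpha_blend ? 1u : 0u);
        return Pipeline{desc, key};
    }

    // Cheap path: binding only swaps a pointer, no per-state checks.
    void bind(const Pipeline& pipeline) { bound = &pipeline; }
};
```

In this model, binding the same pipeline a thousand times triggers the validation step only once, whereas an API that sets each state separately may have to re-check combinations at draw time.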
Decoupled GPU Memory
In existing applications, the driver assigns each buffer a memory area in a very rigid way. This makes it difficult to reuse memory areas, multiplies the number of areas to manage, and inflates total memory consumption. This is partly why the memory utilization of 3D rendering on PC is usually much higher than on consoles, where developers can make the most of available resources.
Mantle allows the application to directly control the transfer of data and management of video memory in order to overcome the existing inefficiencies. Since the application knows exactly what it needs to do, many generic checks are no longer relevant. The application gains control of not only the video memory, but also the system RAM. This, in turn, leverages GPU Memory Virtualization.
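Application-controlled memory is often implemented by taking one large allocation and carving it up by hand. The following sketch shows a linear (bump) sub-allocator of that kind, which is a common engine technique and not something AMD has described as part of Mantle. `LinearGpuHeap` and `kNoSpace` are hypothetical names, and the heap only tracks offsets rather than real video memory.

```cpp
#include <cassert>
#include <cstddef>

// Sentinel returned when a request does not fit (hypothetical name).
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Hands out offsets inside one large block that the application owns.
// It models only the bookkeeping; no real GPU memory is touched.
class LinearGpuHeap {
public:
    explicit LinearGpuHeap(std::size_t capacity) : capacity_(capacity) {}

    // Returns the offset of the new block, or kNoSpace if it does not fit.
    // The alignment must be a non-zero power of two.
    std::size_t allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t start = (offset_ + alignment - 1) & ~(alignment - 1);
        if (start > capacity_ || size > capacity_ - start) {
            return kNoSpace;
        }
        offset_ = start + size;
        return start;
    }

    // Reuses the whole block, e.g. for per-frame transient buffers.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::size_t capacity_;
    std::size_t offset_ = 0;
};
```

Because the application knows when a frame's transient buffers are no longer needed, it can call `reset()` and reuse the same memory for the next frame, which is the kind of reuse a rigid driver-side allocation makes difficult.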
Ultimately, AMD’s goal with Mantle is to extract greater performance from entry-level systems, to enable developers to better predict the behavior and performance of their engines, to let them share optimizations between PC and next-gen consoles, and, in the long run, to pave the way for new rendering techniques.
The GPU manufacturer states that development is on track and that the forthcoming Mantle-specific patch for Battlefield 4 is scheduled for December. As of now, Mantle is available as an Alpha release to a handful of developers, although a Beta version should be available by next month. Full public documentation is expected to become available by GDC in March 2014. The final version is scheduled for availability during the second half of next year.
AMD insists that Mantle was not intended to be limited to a particular architecture. At its base level, Mantle consists of relatively generic functions that could be supported by other architectures. Support for functions specific to its Radeon GPUs is included as an extension. This implies that Mantle could potentially become a standard with multiple extensions to cater for architectural differences. However, this is all theoretical for now, and it remains to be seen whether the technology is feasible in the long run.