Today ARM announces two new additions to its CoreLink system IP design portfolio, the CCI-550 interconnect and DMC-500 memory controller. Starting off with the CCI announcement, we find the third iteration of the Cache Coherent Interconnect. The CCI is the cornerstone of ARM’s big.LITTLE strategy as it provides the required cache-coherent system interconnect between CPU clusters and other SoC blocks such as the main memory controllers and thus enabling heterogeneous multiprocessing between all the IP blocks.
The CCI-550 is an improvement to the CCI-500 which ARM announced back in February among other IPs such as the new Cortex A72 core design. Both the CCI-500 and the new CCI-550 are generational successors to the CCI-400 that is found in all currently released big.LITTLE SoCs such as Samsung’s Exynos, MediaTek’s Helio or Qualcomm’s Snapdragon designs. Back in February I was pretty excited to see ARM improve this part of their IP portfolio as it seemed that there was a lot of optimization that could be done in terms of performance and power.
As a reminder, the primary characteristics of the new CCI-5X0 designs is the addition of a snoop filter within the interconnect that is able to maintain a directory of all cache contents among its coherent agents. On previous IP such as the CCI-400, all coherency messages needed to be broadcasted among all agents, causing them to have to wake up and respond. This not only impacted performance due to the increased latency but also had a power impact caused by the processing overhead. For the new CCI family, ARM explains that in heavy use-cases the new snoop filter can save up to “100’s” of milliwatts of power which is a quite significant figure.
Due the broadcast nature of how the CCI-400 was operated, it meant that adding another coherent agent would have incurred a quadratical increase in the amount of messages such as snoop lookups. The CCI-500 on the other hand is able to take advantage of the new filter to increase the number of ACE (AXI Coherency Extension) master ports from 2 to 4 without increased overhead. This for example enabled the implementation of up to 4 CPU clusters if a vendor wished to do so. The new CCI-550 again improves this configuration option by raising the maximum number of ACE master ports to up to 6.
In the example SoC layout diagram that ARM provides, we see the CCI-550 configured with two CPU clusters such as the Cortex A53 and a Cortex A72. The remaining four ACE master ports could be then dedicated to a fully coherent GPU.
ARM explains that its still to-be-announced next-generation Mali IP codenamed "Mimir" will be fully cache-coherent and would be a perfect fit to take advantage of such a configuration (Current generation Midgard-based GPUs such as the T6-/7-/800 series are only I/O coherent). Fully coherent GPUs will be able to take advantage of shared virtual memory and new simplified programmers models provided by APIs such as OpenCL 2.0 and HSA.
While the amount of ACE master ports increases from 4 to 6, the amount of possible memory interfaces has also gone up from a maximum of 4 to up to 6. This allows an increase of up to 60% in the total peak interconnect bandwidth (total aggregate bandwidth). This improvement not only comes from the two additional memory interfaces, but also an additional increase which can be credited to micro-architectural improvements on the interconnect itself. For example, we're told the CCI-550 is able to reduce CPU-to-memory latency by 20% when compared to the CCI-500.
ARM explains that its CCI IP is highly customizable and thus each vendor can configure it to their needs. The IP will be able to scale in terms of physical implementation based on the number of desired interfaces and ports.
As an IP vendor, ARM is aiming to provide highly optimized integrated solutions, and memory controllers are consequently part of such designs. ARM previously offered the DMC-520 with DDR4 support but this memory controller was aimed at more complex enterprise designs employing AMBA 5 system IP such as ARM’s CCN (Cache Coherent Network). The DMC-500 announced today on the other hand is ARM’s first mobile-targeted memory controller with support for the new LPDDR4 memory standard. Aimed for AMBA 4 system IPs such as the CCI family, this is the memory controller IP we’ll most likely see adopted by vendors in consumer devices such as smartphones.
The DMC-500 promises support for LPDDR4 up to 2133MHz while still maintaining LPDDR3 compatibility. This is an important differentiation factor as in doing so ARM is able to offer maximum flexibility in terms of choice of implementation for vendors. Performance wise, ARM promises up to 27% increase in memory bandwidth utilization in a low power design.
All in all today’s announcements provide some solid improvements in ARM’s IP portfolio. On the memory controller side I’m not certain what the rate of adoption ARM’s DMC’s is; as far as I know the main "heavyweight" SoC vendors currently chose to employ their own memory controller IP. Those who don’t have their own IP and instead use ARM's designs are often hard to single out as many times the choice of memory controller is completely invisible to the system.
On the interconnect side I predict that we’ll be seeing a lot more discussions and developments from third-party vendors. Even among today’s higher-profile big.LITTLE SoCs I’m only aware of LG’s Odin to use ARM’s CCI as a “center-piece” in their SoC fabric while other vendors chose to implement it alongside their own interconnect fabric. Vendors who have the resources and design talent may also chose to implement cache coherency into their own interconnect IP. They would thus be able deploy big.LITTLE systems or other similar fully coherent SoCs without ARM’s CCI IP. For example, MediaTek is among the first to do exactly this in the Helio X20 with help of the in-house designed MCSI. Next year we should be seeing new big.LITTLE SoCs equipped with both ARM’s IP such as the CCI-500 or 550 alongside third-party IP, creating a new differentiation point for SoC vendors that will undoubtedly make competitive landscape much more interesting.
from AnandTech http://ift.tt/1NxCUJj
via IFTTT
No comments:
Post a Comment