Making the Move To Multicore




December 15, 2006
There was a time when clock speed and the number of transistors defined the next levels of microprocessor performance. Chip vendors are still increasing the number of transistors on a single chip, but they are no longer trying to double clock speeds every two years because thermal dissipation and power consumption have gotten out of hand.

“Data centers are using too much power, and at the same time servers are underutilized,” said Margaret Lewis, director of commercial solutions at AMD. “In Manhattan it either costs millions to bring more power in or there is no space to expand. Power, space and cooling are big issues.”

Intel had been increasing the clock speed of its single-core (aka unicore) chips by 40 percent per year every two years until the power consumption reached about 100 watts, said Geoff Lowney, fellow in the digital enterprise group and director of compiler and architecture advanced development at Intel. About two years ago, the company decided that the best way to increase processing power was not to continue to increase frequency but instead to continue to increase the number of transistors on a single chip, utilizing a multicore architecture.

“We could still make unicores run faster, but it wouldn’t be as efficient,” said Lowney. “A 2x increase in speed does not yield a 2x gain in performance; however, if you add four times the transistors using two cores, you get 4x performance.”

Approximately 70 percent of all Intel microprocessors shipped by the end of the year will be dual-core or quad-core. Lewis said 70 percent of AMD’s microprocessors shipped by year’s end will be dual-core with a whopping 90 percent shipping for use in servers, desktops and workstations. AMD will unveil what it calls a “true” quad-core in mid-2007 that will feature four cores on a single piece of silicon. By contrast, Intel’s newly announced quad-core combines two dual-cores in a single package.

Multicore processors operate at lower frequencies than their unicore counterparts and therefore consume less power and dissipate less heat. The thermal dissipation and power consumption benefits are attractive to designers of embedded and enterprise systems alike.

“The scales [of clock speed versus power consumption] are not linear,” said Michael Christofferson, director of product management at Enea. “When you increase clock speed, power consumption and the related costs go up significantly. Multicore doubles processing power with minor increases to power consumption.”
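The nonlinearity Christofferson describes can be illustrated with a rough back-of-the-envelope sketch. It relies on the standard CMOS approximation that dynamic power is proportional to capacitance times voltage squared times frequency, plus one loud assumption that is not from the article: that supply voltage has to scale in step with clock frequency. The function names are hypothetical, and the throughput figure assumes a perfectly parallel workload.

```python
def relative_dynamic_power(freq_scale: float, cores: int = 1) -> float:
    """Relative dynamic power under P ~ C * V^2 * f.

    Assumption (not from the article): supply voltage scales linearly
    with clock frequency, so per-core power grows with the cube of it.
    """
    voltage_scale = freq_scale  # simplifying assumption
    per_core = voltage_scale ** 2 * freq_scale
    return cores * per_core


def relative_throughput(freq_scale: float, cores: int = 1) -> float:
    """Ideal throughput relative to one core at the baseline frequency."""
    return cores * freq_scale  # assumes the workload parallelizes perfectly


if __name__ == "__main__":
    for label, freq, cores in [("one core, 2x clock", 2.0, 1),
                               ("two cores, 0.8x clock", 0.8, 2)]:
        print(label,
              "power:", round(relative_dynamic_power(freq, cores), 3),
              "throughput:", round(relative_throughput(freq, cores), 3))
```

Under these assumptions, doubling the clock of one core costs roughly eight times the power, while two cores at 0.8 times the clock draw about the same power as the baseline for an ideal 1.6 times the throughput. Real parts will differ, since leakage power and voltage floors are ignored here.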

Aside from the “obvious” physical limitations, David Kleidermacher, CTO of Green Hills Software, said the functionality of devices is evolving in such a way that single-core designs are no longer a match.

Of course, the concept of using multiple processors is not a new one. Designers have been placing multiple processors on a single board for some time. The advantage of moving the processors or cores to a single-chip design is increased speed. As an example, the cores on an Intel Core 2 Duo communicate in a matter of nanoseconds. By contrast, multiple processors on a single board communicate in hundreds of nanoseconds, said Lowney.

Other benefits include parallelism, as well as a reduction in hardware, cooling and other power-related costs.

“Systems are getting smaller, while applications are becoming larger and more feature-rich,” said Kerry Johnson, a product manager at QNX Software Systems. “You need to be able to use multiple processors in a smaller footprint.”

MULTICORE BASICS
Multicore processors are being incorporated into servers, workstations, desktops, laptops, telecommunications infrastructure, handheld devices and gaming systems, to name a few.

In a dual-core microprocessor, there are two separate cores, each of which has its own Level 1 (L1) cache. L1 cache is dedicated to its respective core; it holds instructions and serves as a data cache for the most recently used data. Level 2 (L2) cache is shared between the two cores, and either core can read or write to it, according to Intel’s Lowney. He also said the two cores can talk to each other through L2 cache without going off the die.

The two cores share a single system bus that connects to system memory.

Each core can run a separate operating system or be dedicated to specific tasks, which enables Linux and Windows operating systems and applications to run simultaneously on a single chip, for example. This is called Asymmetric Multiprocessing, or AMP.

Alternatively, one copy of an operating system can control all tasks performed on both cores, dynamically allocating tasks or threads to the underutilized core to achieve maximum system utilization. This is called Symmetric Multiprocessing, or SMP.

Most RTOS vendors support both; some, such as QNX, support a third mode called “Bound Multiprocessing,” or BMP (also known as “Core Affinity”), which combines the best features of SMP and AMP.

SYMMETRIC MULTIPROCESSING
The main benefit of symmetric multiprocessing is increased processing power over unicore processors. One copy of the operating system oversees all cores on the chip, so the operating system is aware of what every CPU is doing. In SMP mode, the operating system automatically handles the complexity of load balancing and communication among cores, whereas AMP requires developers to handle the communication manually.

Debugging is also a consideration. One school of thought is that SMP mode eases the debugging process because the operating system is aware of everything occurring on the chip. Another view is that AMP is easier to debug because an operating system and its applications are tied to a specific core. However, in AMP mode, debugging occurs independently, so it may be difficult to figure out the interaction between the cores.

Instead of performing all tasks on a unicore, which has practical clock speed limitations, a multicore processor can deliver higher processing power by allowing tasks or threads to run in parallel.

“SMP was designed so you can mimic single-processor designs in a distributed computing environment,” said Enea’s Christofferson. “The major issue is concurrency. In a single operating environment, running multiple threads is a priority, so two threads with different priority levels can end up executing in parallel when they are distributed to different cores.”

This can create state issues. He also said that in many designs there are components within an operating system that may have hidden requirements, such as never running at the same time as another thread. To avoid the problem, Christofferson recommended that designers consider all the operating system or application threads to make sure there are no problems with concurrency.
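One common form of the hidden requirement described above is code that relies on thread priority for mutual exclusion: on a single core, a lower-priority thread never runs while a higher-priority one is active, but on SMP both can execute at once. A minimal sketch of the usual remedy, making the exclusion explicit with a lock, follows. Python threads stand in for RTOS threads here, and `SharedCounter` and `run_workers` are hypothetical names chosen for illustration.

```python
import threading


class SharedCounter:
    """Shared state whose update is protected by an explicit lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write below could interleave
        # with another thread's update and lose increments.
        with self._lock:
            self.value += 1


def run_workers(num_threads: int = 4, increments: int = 10_000) -> int:
    """Run several threads that all update the same counter."""
    counter = SharedCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())
```

With the lock in place, the final count equals the number of threads times the number of increments regardless of how the threads are scheduled across cores.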

At the RTOS level, the operating system needs to be able to support load balancing so processes or threads can be distributed to the underutilized cores. Load balancing is a big issue because how well the load balancing is executed determines how much bandwidth is available.

In a complex system, it may not make sense for every thread to be subjected to dynamic swapping, which is why it’s important to use the right debugging tools and system management software for fault handling, said Christofferson.

According to Green Hills’ Kleidermacher, SMP is the preferred choice for small-scale embedded systems that need a performance boost. Conversely, he said larger-scale systems, such as workstation clusters, are better suited to AMP.

Wind River Systems CTO Tomas Evensen thinks the reverse is true. He said SMP is ideal for the enterprise space because one operating system controls all cores. It allows you to add more threads or tasks to an application design because if too many threads are trying to execute simultaneously, they are simply moved to a free processor. The downside of SMP is that it’s hard to scale because the synchronization and communication among cores increase overhead. He predicts more AMP systems will show up in the embedded space because AMP dedicates cores to operating systems, applications and tasks.

“AMP is better for dedicated tasks,” he said. “You can use Linux and VxWorks simultaneously.”

ASYMMETRIC MULTIPROCESSING
AMP treats each core as a separate entity. It requires the operating system to consider which cores will have access to which peripherals and how the two cores will communicate to ensure that resources are allocated properly. At present, IP is generally used for intercore communication. The messaging between the cores increases system overhead and thus lowers net processing power.

“The problem with running two copies of two OSes is that there is no good standard for communicating between processors and OSes,” said Evensen. “IP is slow, which is why we’re pushing TIPC, an open-source project.”

Enea is pushing another IPC (interprocess communication) technology called Links that allows CPUs to communicate with each other and the outside world using other external devices. This is important, said Enea’s Christofferson, because many of his customers are combining Linux and OSE, dedicating one operating system to each core. OSE is used to control the device, while Linux is used because of the wide range of applications it supports.

“It’s like the separation of control plane and data plane,” said Christofferson.

The main benefit of AMP is that operating systems, tasks and peripheral usage can be dedicated to a single core as necessary, which some say eases the transition from unicore designs to multicore designs, at least from a debugging perspective.

“There is a trick to figuring out how to accomplish device sharing between or among cores if you want to share a common device like an Ethernet device,” said Christofferson. “You need a driver, device handler or some other way to manage access.”
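One way to picture the "device handler" Christofferson mentions is a single owner that serializes all access to the shared device while everyone else submits requests to it. The sketch below is only an analogy: Python threads and an in-process queue stand in for separate cores and an intercore message channel, and `DeviceGatekeeper` and `write_fn` are hypothetical names, not part of any vendor's API.

```python
import queue
import threading


class DeviceGatekeeper:
    """Single owner of a shared device; others submit requests via a queue."""

    def __init__(self, write_fn):
        self._write_fn = write_fn          # stand-in for the real driver call
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve)
        self._thread.start()

    def _serve(self):
        # Only this thread ever touches the device, so writes never overlap.
        while True:
            frame = self._requests.get()
            if frame is None:              # shutdown marker
                break
            self._write_fn(frame)

    def send(self, frame: bytes):
        """Queue a frame for transmission; safe to call from any thread."""
        self._requests.put(frame)

    def close(self):
        """Drain pending requests and stop the owner thread."""
        self._requests.put(None)
        self._thread.join()


if __name__ == "__main__":
    sent = []
    gate = DeviceGatekeeper(sent.append)
    gate.send(b"from core 0")
    gate.send(b"from core 1")
    gate.close()
    print(sent)
```

The design choice is that callers never hold a reference to the device itself, only to the gatekeeper, so ordering and exclusivity are enforced in one place.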

BOUND MULTIPROCESSING
Bound multiprocessing enables the use of a single operating system among all cores and at the same time enables tasks or resources to be dedicated to particular cores. The benefit of binding threads to a single core is that it improves performance by lowering overhead. The downside is that if a task has been committed to one core, it cannot take advantage of the other core’s processing power.
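The idea of binding work to a core can be sketched on Linux, where Python exposes the scheduler's affinity calls as `os.sched_getaffinity` and `os.sched_setaffinity`. This is an illustration of core affinity in general, not of how QNX or any other RTOS implements BMP, and `pin_to_core` is a hypothetical helper name. The calls are unavailable on some platforms, so the sketch returns `None` there.

```python
import os


def pin_to_core(core: int):
    """Restrict the caller to a single core.

    Returns the resulting affinity set, or None if the platform does not
    expose affinity control or the requested core is not available.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None                         # e.g. macOS or Windows
    allowed = os.sched_getaffinity(0)       # 0 means "the caller"
    if core not in allowed:
        return None
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print(pin_to_core(0))
```

Once pinned, the caller no longer migrates between cores, which mirrors the trade-off described above: less scheduling overhead, but no access to the other core's spare capacity.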

Christofferson said Enea’s customers want to know which mode (SMP, AMP or BMP) will give them double the performance. The actual performance depends on the implementation of the RTOS—whether it is implemented in a way that enables threads to be load-balanced quickly.

Green Hills adopted BMP to allow customers to dynamically switch modes so the designer can understand whether a problem is a multicore problem.

According to QNX’s Johnson, “Legacy code and the investments in it need to be preserved when adding new features. At the same time, you want new applications to take advantage of multicore processors. BMP allows you to select and fix the problem for multicore.”

His colleague Robert Craig, a QNX software manager, said if BMP is chosen at the application level, the mode can be switched without rebooting. By contrast, a designer must choose between AMP and SMP at the beginning and then can’t switch later.

“We wanted to give customers a choice so they can migrate at their own pace,” said Craig.

Creating multithreaded code is difficult for developers who are used to developing applications in a serial fashion that will run on unicore microprocessors, said AMD’s Lewis. On the other hand, multithreaded programs that have been designed with parallelism in mind are the easiest to move to multicore, she said. When looking at a GUI, graphics or a physics engine, a developer should look at the code and consider what could be run simultaneously.

“There’s a certain amount of automatic parallelism in today’s compilers,” she said. “There is a move among chip and application vendors to increase the amount of automatic parallelism even more.”

Figuring out how to allocate tasks is one of the biggest design issues, said Wind River’s Evensen. “If you have a serial application, you need to figure out how to use multiple threads.”

One way to attack the problem is to dedicate different cores to different tasks. For example, in an MPEG-4 video stream application, a developer could assign serial stages such as data stream capture, placement into memory, decryption and resolution scaling to separate cores, so that the cores form a pipeline.
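The pipeline arrangement described above can be sketched with one worker per stage and a queue between neighbors. Python threads stand in for cores here (in CPython they do not execute bytecode truly in parallel, so this shows the structure rather than the speedup), and the stage functions in the usage example are trivial placeholders, not real decryption or scaling.

```python
import queue
import threading

_DONE = object()  # marker passed down the pipeline to signal end of stream


def _stage(func, inbox, outbox):
    """Apply func to each item from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            break
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Push items through funcs in order, one worker thread per stage."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    decrypt = lambda frame: frame ^ 0x5A    # placeholder "decryption"
    scale = lambda frame: frame * 2         # placeholder "scaling"
    print(run_pipeline([1, 2, 3], [decrypt, scale]))
```

Because each stage has a single worker and the queues are first-in, first-out, frames leave the pipeline in the order they entered, while different stages can work on different frames at the same time.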

Alternatively, if an application processes a number of independent items, such as packets where none depends on another, they can be split up and executed as parallel streams.
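The parallel-streams approach can be sketched with a worker pool that hands independent packets to whichever worker is free. The `checksum` function is a hypothetical stand-in for real per-packet work. A thread pool is used to keep the sketch short; in CPython, CPU-bound work would need a process pool (for example `concurrent.futures.ProcessPoolExecutor`) to occupy several cores at once.

```python
from concurrent.futures import ThreadPoolExecutor


def checksum(packet: bytes) -> int:
    """Placeholder per-packet work: a one-byte additive checksum."""
    return sum(packet) % 256


def process_packets(packets, workers: int = 2):
    """Process independent packets concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() may run the calls concurrently but yields results in
        # the same order as the input packets.
        return list(pool.map(checksum, packets))


if __name__ == "__main__":
    print(process_packets([b"\x01\x02", b"\x10\x20\x30"]))
```

This only works because no packet's result depends on another's; once packets share state, the synchronization concerns discussed earlier apply.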

Green Hills’ Kleidermacher said that debugging can also be a problem because the traditional methods just don’t work, particularly in SMP mode. Synchronization problems don’t tend to show up during the testing or deployment phases.

“You may get fantastic visibility into both cores, but you may not be able to tell what they’re doing together,” he said. “Another problem is the weird state issues that arise when one core stops and the other runs.”

In the end, Evensen said there’s no one silver bullet—the selection of a mode just depends on the application.

“Use SMP if you need to predict which processors are running. Alternatively, if you want to set affinity, dedicating a task to a specific processor, then choose BMP but realize the more tasks you dedicate, the less load balancing will occur,” he said. “If you’re dedicating too many tasks, choose AMP.”


Share this link: http://www.sdtimes.com/link/29910
 
