
Xeon 5500 Series, Nehalem 2.9GHz Processor

Let's take a look at Intel's latest offering on the Xeon platform and see how well its design performs

Anindya Roy

Friday, May 01, 2009


The Nehalem-based Xeon 5500 series is the first Xeon processor from Intel with a native quad-core design. We received a 1U rack server from Intel with two 2.9 GHz Nehalem processors and 24GB of RAM. We ran quite a few tests on it, and as expected, the results were mind-boggling. We will discuss them later. First, let's take a quick look at some of the key features of the processor, which are much the same as those of the Core i7 desktop architecture:

Native quad-core
Until now, every Xeon processor with more than two cores was built as a multi-chip module of dual-core processors: essentially two or three dual-core modules fixed in one chip to create a quad- or hex-core CPU. The Xeon 5500 series has a native quad-core design instead, similar to AMD's Phenom X4 CPUs. The same feature is also available in Intel's Nehalem-based desktop processors, the Core i7.

Price: On request
Meant for: Data centers
Key Specs: Native quad-core, inclusive level 3 cache, integrated memory controller, Hyper-threading
Cons: None
SMS Buy 130569 to 56677

The advantages of a native quad-core design over an MCM (multi-chip module) are significant in terms of processor energy efficiency, performance, and dynamic scalability. We will see some of these in our benchmark results.

Inclusive level 3 cache
A level 3 cache was first showcased on earlier Xeon server chips and then on desktop Core i7 CPUs. The Xeon 5500 family features up to a massive 8MB of it, shared between all four cores, compared with 2MB on the Phenom X4. The cache is described as an inclusive level 3 cache. Intel claims that an inclusive cache is more efficient than an 'exclusive' cache design, even though it means that 1MB of Nehalem's 8MB level 3 cache is taken up by copies of the 256KB level 2 cache inside each of the four processing cores.
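The 1MB figure is simple arithmetic on the cache sizes quoted above. A minimal sketch, following the article's accounting (only the level 2 copies are counted; the function and variable names are illustrative, not Intel terminology):

```python
# Cache sizes as quoted in the article; all names here are illustrative.
CORES = 4
L2_PER_CORE_KB = 256
L3_TOTAL_KB = 8 * 1024


def inclusive_overhead_kb(cores: int, l2_per_core_kb: int) -> int:
    """L3 space spent mirroring every core's L2 in an inclusive design."""
    return cores * l2_per_core_kb


overhead_kb = inclusive_overhead_kb(CORES, L2_PER_CORE_KB)
remaining_l3_kb = L3_TOTAL_KB - overhead_kb

print(f"L3 used for L2 copies: {overhead_kb} KB")
print(f"L3 left for other data: {remaining_l3_kb} KB")
```

Under this accounting, roughly one eighth of the shared cache holds duplicated data, which is the cost Intel accepts for the inclusive design.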

Integrated memory controller
By modularizing the design of the CPU and the Northbridge, Intel has brought the memory controller onto the Nehalem CPU die, so DDR3 memory now attaches directly to the processor. The conventional front side bus (FSB) is replaced by a new point-to-point link called the QuickPath Interconnect. QuickPath takes over the FSB's role of connecting the CPU to the rest of the system: the second processor socket and the I/O hub that hosts components such as the PCI Express controller. Together, the on-die memory controller and QuickPath reduce latency and improve performance considerably.

[Figure: time taken for different problem sizes on the 2-socket Nehalem server]

Hyper-threading
Another feature worth mentioning is Hyper-threading. By using a core's spare resources to execute a second thread, Hyper-threading lets a quad-core Nehalem processor accept and process eight threads simultaneously, making it more parallel and more powerful than the current Core 2 Quad CPUs.
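The thread count the operating system sees is just sockets times cores times threads per core. A minimal sketch of that arithmetic (the helper name is illustrative); the last line uses Python's standard `os.cpu_count()` to report the logical processor count on whatever machine runs it:

```python
import os


def logical_processors(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Number of logical processors the operating system will list."""
    return sockets * cores_per_socket * threads_per_core


# One quad-core Nehalem with Hyper-threading: two threads per core.
one_socket = logical_processors(1, 4, 2)

# The two-socket server reviewed here.
review_server = logical_processors(2, 4, 2)

print(one_socket, review_server)
print("Logical processors on this machine:", os.cpu_count())
```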

Performance results
We ran three benchmarks on the server: LINPACK, SunGard and Cinebench. We also recorded its power consumption at different levels. For all of these benchmarks, we used Windows Server 2008 as the OS. Here is what we got.

A dual-socket Nehalem server shows 16 processors to the OS because of Hyper-threading.

Linpack
This test produced some really interesting results. As expected, the server beat Intel's Harpertown and Dunnington processors. More surprisingly, it outperformed even the 24-core Dunnington system, although it has only 8 cores and 16 threads. The final result was a whopping 76 GFlops, which is 14 GFlops more than the 24-core (6 cores x 4 sockets) Dunnington. This result was achieved with a problem size of 50000 and 16 threads in Linpack.

Another interesting observation is that Hyper-threading gives the processor a real edge over its predecessors. When we ran the same problem with 8 threads, equal to the actual number of cores, performance was much lower.
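To put the Linpack numbers in per-core terms, here is a minimal sketch using only the figures quoted above. The theoretical peak at the end is not from our measurements: it assumes 4 double-precision floating point operations per core per clock cycle and the nominal 2.9 GHz clock with no turbo, so treat the efficiency figure as a rough estimate.

```python
# Figures quoted in the review.
NEHALEM_GFLOPS = 76.0
NEHALEM_CORES = 8
DUNNINGTON_CORES = 24
DUNNINGTON_GFLOPS = NEHALEM_GFLOPS - 14.0  # "14 GFlops more than" Dunnington

nehalem_per_core = NEHALEM_GFLOPS / NEHALEM_CORES
dunnington_per_core = DUNNINGTON_GFLOPS / DUNNINGTON_CORES

# ASSUMPTIONS (not measured): nominal clock and 4 flops per core per cycle.
CLOCK_GHZ = 2.9
FLOPS_PER_CYCLE = 4
peak_gflops = NEHALEM_CORES * CLOCK_GHZ * FLOPS_PER_CYCLE
efficiency = NEHALEM_GFLOPS / peak_gflops

print(f"Nehalem:    {nehalem_per_core:.2f} GFlops per core")
print(f"Dunnington: {dunnington_per_core:.2f} GFlops per core")
print(f"Assumed peak: {peak_gflops:.1f} GFlops, efficiency {efficiency:.0%}")
```

Per core, the Nehalem system delivers between three and four times the Linpack throughput of the Dunnington system.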

SunGard
We used SunGard Adaptive Analytics, a component of SunGard's suite of risk management products. More precisely, the benchmark is a stripped-down version of the actual product. It uses a Monte Carlo financial engine to predict the future of a fictitious portfolio, and it requires two files to run. The first contains sample data that represents actual market conditions, and the second contains the sample customer's investment portfolio. The benchmark score is the run time in seconds, so the less time a run takes, the better the performance.

In this test, our server finished the task in 130.5 seconds. Dunnington, with three times the number of cores, finished the same test in 105.9 seconds, which is only about 19% less time. If we multiply each result by the number of cores in the server, we get about 1044 core-seconds for Nehalem and 2541 for Dunnington. On a per-core basis, then, Nehalem gave 2.4 times better performance than Dunnington. This is indeed a brilliant score.
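The per-core comparison can be reproduced from the two run times and core counts. A minimal sketch (the dictionary layout and helper name are illustrative):

```python
# Run times and core counts quoted in the review.
nehalem = {"seconds": 130.5, "cores": 8}
dunnington = {"seconds": 105.9, "cores": 24}


def core_seconds(result: dict) -> float:
    """Total core time consumed: run time multiplied by the number of cores."""
    return result["seconds"] * result["cores"]


nehalem_cs = core_seconds(nehalem)
dunnington_cs = core_seconds(dunnington)

# Fewer core-seconds means each core did more work per second.
per_core_advantage = dunnington_cs / nehalem_cs

# How much less wall-clock time Dunnington needed, as a fraction of Nehalem's time.
time_saved = (nehalem["seconds"] - dunnington["seconds"]) / nehalem["seconds"]

print(f"Nehalem:    {nehalem_cs:.1f} core-seconds")
print(f"Dunnington: {dunnington_cs:.1f} core-seconds")
print(f"Per-core advantage: {per_core_advantage:.2f}x")
print(f"Dunnington wall-clock saving: {time_saved:.1%}")
```

Core-seconds is a crude measure, since it assumes the workload scales perfectly with core count, but it matches the comparison made above.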

CineBench
Finally, we ran CineBench 10 x64. This benchmark measures the performance of the processor and the graphics card, so it has two parts: the first is processor intensive and the second is graphics intensive. The processor test first runs on a single CPU and then on all cores. The graphics test runs inside a 3D window, playing an animated scene whose demand on the graphics hardware starts low and increases as it goes. A score is then generated while the processor works at maximum speed to display the scene properly. The higher the score, the better the server's performance. The score we got for a single CPU was 4429 CB-CPU, about 36% better than Dunnington, which gave around 3266 CB-CPU with one CPU. With all CPUs, Nehalem scored 28667 CB-CPU.
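The ratios behind these scores can be checked directly. A minimal sketch using only the three CB-CPU scores quoted above (the variable names are illustrative, and the core count of 8 comes from the two quad-core sockets in our server):

```python
# CB-CPU scores quoted in the review.
NEHALEM_SINGLE = 4429
DUNNINGTON_SINGLE = 3266
NEHALEM_ALL = 28667
PHYSICAL_CORES = 8  # two quad-core sockets

# Single-CPU lead of Nehalem over Dunnington.
single_lead = NEHALEM_SINGLE / DUNNINGTON_SINGLE

# How far the score scales when every core (and thread) is used.
multi_speedup = NEHALEM_ALL / NEHALEM_SINGLE
scaling_per_core = multi_speedup / PHYSICAL_CORES

print(f"Single-CPU lead over Dunnington: {single_lead - 1:.0%}")
print(f"Speedup with all CPUs: {multi_speedup:.2f}x")
print(f"Scaling efficiency per physical core: {scaling_per_core:.0%}")
```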

Bottom line: If you are planning to scale your datacenter and want servers that can cope with mission-critical virtualization and parallel processing, then this architecture is for you.
