OpenPOWER Academic Group Carries 2016 Momentum to New Year

By Ganesan Narayanasamy, Leader, OpenPOWER Academic Discussion Group

Academia has always been a leader in pushing the boundaries of science and technology, with some of the most brilliant minds in the world focused on improving the tools at their disposal to solve some of the world’s most pressing challenges. That’s why, as the Leader of the OpenPOWER Academic Discussion Group, I believe working with academics in universities and research centers to develop and adopt OpenPOWER technology is key to growing the ecosystem. The group enables academics to carry out research and development using Power CPUs and systems, which drives strong growth for the OpenPOWER-based ecosystem.

Bringing OpenPOWER Outside of the Data Center

By Timothy Pearson, Raptor Engineering


Ever wish you could use something other than an insecure x86 or low-powered ARM machine to communicate with the OpenPOWER server sitting in your data center? Wish no longer! Meet the Talos™ workstation-class ATX mainboard, built on OpenPOWER and bringing the security and open systems advantages of POWER8 out of the data center and onto your desk.

OpenPOWER-member Raptor Engineering is committed to making owner-controllable, Libre-friendly systems available for engineers, programmers, data analysts, as well as anyone else who needs serious computing power, security, and flexibility all in the same machine. The OpenPOWER Foundation provides access to the only modern, performant architecture and shipping CPU that meets these criteria—so OpenPOWER is a perfect fit for our Talos™ machines. Talos™ also shines in storage servers and network processing, where the large number of PCIe 3.0 slots combined with POWER8’s I/O performance provides both configuration flexibility and high performance.

Meet Talos

The Talos mainboard hosts a single socketed POWER8 processor and two Centaur DDR3 memory buffers on a standard ATX mainboard.  It includes significant I/O and memory expansion capabilities, including 8 DDR3 ECC memory slots and 7 PCIe slots (56 total PCIe 3.0-capable lanes!), along with the wide variety of on-board peripherals expected in a workstation class mainboard.  Unlike existing OpenPOWER machines, Raptor Engineering has gone one step further and is using reprogrammable logic devices (FPGAs) that have an open toolchain available, making Talos™ completely self-hosting! If you need to modify any aspect of the Talos™ firmware or reprogrammable logic, you can completely recompile and resynthesize the firmware using your Talos™ machine instead of having to fall back to an x86 or Microsoft® Windows® environment.  We have also been instrumental in securing the release of the SBE/Winkle engine code, and as a result the Talos™ mainboard is completely open down to the lowest level firmware and machine schematics, making it an ideal research and development platform to explore next-generation technologies such as CAPI.

Thanks to IBM’s support of Linux on OpenPOWER, Talos™ is ready to run using a variety of modern Linux distributions. We have tested and qualified a wide variety of hardware on our POWER8 SDV for use with Talos™, including GPUs, Mellanox Infiniband devices, and much more. Thanks to POWER8’s little endian support, most Linux drivers simply work, and those few that exhibit minor issues due to faulty x86-centric coding are usually trivial to fix. We also plan to work with BSD developers to port one or more of the BSDs to OpenPOWER in support of Talos™, opening the world beyond x86 even wider.

Learn More About Talos

Visit the Talos product page to watch videos, read white papers, and learn about how the new Talos workstation brings the data center to your desk!

Expanding Ecuador’s Supercomputing Future with Yachay and OpenPOWER

By Alejandra Gando, Director of Communications, Yachay EP


The pursuit of supercomputing represents a major step forward for Ecuador, and Yachay EP, together with IBM and OpenPOWER, is leading the way.

Yachay, a planned city for technological innovation and knowledge intensive businesses combining the best ideas, human talent and state-of-the-art infrastructure, is tasked with creating the worldwide scientific applications necessary to achieve Good Living (Buen Vivir). In its constant quest to push Ecuador towards a knowledge-based economy, Yachay found a partner in OpenPOWER member IBM to create a source of information and research on issues such as oil, climate and food genomics.

Yachay will benefit from state-of-the-art technology: IBM’s new OpenPOWER LC servers, infused with innovations developed by the OpenPOWER community, will support the search for and improved production of non-traditional exports based on Ecuador’s rich biodiversity. Yachay will be able to use genomic research to improve the quality of products and become more competitive in the global market. Genomic research is revolutionizing both the food industry and medicine, but until now progress in the local genomics field has been slowed by the sheer volume of data involved, which has been an obstacle to investigation.

“For IBM it is of great importance to provide an innovative solution for the country, the region and the world, in order to provide research and allow Ecuador to pioneer in areas such as genomics, environment and new sources of economic growth” says Patricio Espinosa, General Manager, IBM Ecuador.

Installed on an infrastructure of IBM POWER8 servers and storage solutions, with the capacity to run advanced analytics and cognitive computing software, the system acquired by Yachay EP enables real-time HPC applications on large data volumes, expanding scientists’ ability to make quantitative predictions. IBM systems use a data-centric approach, integrating and linking data with predictive simulation techniques that expand the limits of scientific knowledge.

The new supercomputing project will allow Yachay to foster projects with a higher technology component, to create simulations and to do projects with the capacity of impacting the way science is done in the country.

Héctor Rodríguez, General Manager of the Public Company Yachay, noted with pride the consolidation of an increasingly strong ecosystem for innovation, entrepreneurship and technological development in Ecuador.

Once the supercomputer is in place, researchers at Yachay will be able to work on projects that require supercomputing, enabling better and faster results. By applying the power of high-performance computing to these analyses, organizations and companies can maximize their operations and minimize the latency of their systems, allowing them to obtain deeper findings in their research.

Want to learn more? Visit our website (available in English and Spanish) and follow us on Twitter at @CiudadYachay.

New PGI compilers enable seamless migration of GPU-enabled HPC applications from Linux/x86 to NVLink-enabled OpenPOWER+Tesla

By Doug Miles, director of PGI compilers & tools, NVIDIA Corporation

NVIDIA introduced the first production release of the PGI Fortran, C and C++ compilers with OpenACC targeting Linux/OpenPOWER and Tesla computing systems, including IBM’s OpenPOWER LC servers that combine POWER8 CPUs with NVIDIA NVLink interconnect technology and NVIDIA Tesla GPU accelerators.

Simplifying Migration from Linux/x86 to Linux/OpenPOWER Processor-based Servers

PGI for OpenPOWER enables easy porting of PGI-compiled HPC applications from Linux/x86 to Linux/OpenPOWER, often through a simple re-compile, including support for the OpenMP 3.1, OpenACC and CUDA Fortran parallel programming models. A good example is the WRF weather research and forecasting model, which together with its various support packages comprises over 800,000 lines of mostly Fortran source code. The OpenMP version of WRF 3.8.1 can be compiled on either Linux/OpenPOWER or Linux/x86 using the new PGI 16.10 compilers with identical makefiles, compiler options, source code and open source support packages.


Use at Oak Ridge National Laboratory  

The PGI compiler suite for OpenPOWER is among the available tools Oak Ridge National Laboratory will use to build and run large HPC applications on x86 CPUs, OpenPOWER CPUs and NVIDIA GPUs using the same source code base.

“Porting HPC applications from one platform to another is a significant and challenging effort in the adoption of new hardware technologies,” said Tjerk Straatsma, Scientific Computing Group Leader at Oak Ridge National Laboratory. “Architectural and performance portability like this is critical to our application developers and users as we move from existing CPU-only and GPU-enabled applications on machines like Titan to DOE’s upcoming major systems including the Summit system we’re installing at ORNL.” The upcoming CORAL Summit system at ORNL will be based on POWER9 CPUs and NVIDIA Volta GPUs.

OpenACC: The Easy On-ramp to GPU Computing

In addition to ease of porting between Linux/x86 and Linux/OpenPOWER platforms, the new PGI compilers support OpenACC directive-based GPU programming in Fortran, C and C++ for an easy on-ramp to GPU-computing with NVIDIA Tesla accelerators. As an example, consider this code fragment from the OpenACC version of the CloverLeaf mini-app, originally developed by AWE in the UK:

66 !$ACC DATA &
 67 !$ACC PRESENT(density0,energy0,pressure,viscosity,volume,xarea) &
 68 !$ACC PRESENT(xvel0,yarea,yvel0) &
 69 !$ACC PRESENT(density1,energy1) &
 70 !$ACC PRESENT(xvel1,yvel1) &
 71 !$ACC PRESENT(volume_change)
 73   IF(predict)THEN
 76     DO k=y_min,y_max
 77       DO j=x_min,x_max
 79         left_flux=  (xarea(j  ,k  )*(xvel0(j  ,k  )+xvel0(j  ,k+1) &
 80                                     +xvel0(j  ,k  )+xvel0(j  ,k+1)))*0.25_8*dt*0.5
 81         right_flux= (xarea(j+1,k  )*(xvel0(j+1,k  )+xvel0(j+1,k+1) &
 82                                     +xvel0(j+1,k  )+xvel0(j+1,k+1)))*0.25_8*dt*0.5
 83         bottom_flux=(yarea(j  ,k  )*(yvel0(j  ,k  )+yvel0(j+1,k  ) &
 84                                     +yvel0(j  ,k  )+yvel0(j+1,k  )))*0.25_8*dt*0.5
 85         top_flux=   (yarea(j  ,k+1)*(yvel0(j  ,k+1)+yvel0(j+1,k+1) &
 86                                     +yvel0(j  ,k+1)+yvel0(j+1,k+1)))*0.25_8*dt*0.5
 87         total_flux=right_flux-left_flux+top_flux-bottom_flux
 89         volume_change(j,k)=volume(j,k)/(volume(j,k)+total_flux)
 91         min_cell_volume=MIN(volume(j,k)+right_flux-left_flux+top_flux-bottom_flux &
 92                            ,volume(j,k)+right_flux-left_flux                      &
 93                            ,volume(j,k)+top_flux-bottom_flux)
 95         recip_volume=1.0/volume(j,k)
 97         energy_change=(pressure(j,k)/density0(j,k)+viscosity(j,k)/density0(j,k)) &
 98                       *total_flux*recip_volume
 99         energy1(j,k)=energy0(j,k)-energy_change
 101         density1(j,k)=density0(j,k)*volume_change(j,k)
 103       ENDDO
 104     ENDDO
 106 ...

Compiling the code above targeting a Tesla GPU on a Linux/OpenPOWER IBM Minsky system with the PGI OpenACC Fortran compiler yields the following output from the compiler:

% pgfortran -fast -ta=tesla -Minfo -c PdV_kernel.f90
      66, Generating present(density1(:,:),energy1(:,:),...)
      76, Loop is parallelizable
      77, Loop is parallelizable
          Accelerator kernel generated
          Generating Tesla code
          76, !$acc loop gang, vector(4) ! blockidx%y threadidx%y
          77, !$acc loop gang, vector(32) ! blockidx%x threadidx%x

The compiler scans the code between the OpenACC KERNELS and END KERNELS directives, determines the loops are parallelizable, and parallelizes the code for execution on a Tesla GPU.

The same code can be compiled for serial execution on any platform by any standard Fortran compiler, or the PGI compiler on the IBM system can process the OpenACC directives to generate parallel code targeting the multicore OpenPOWER CPUs:

% pgfortran -fast -ta=multicore -Minfo -c PdV_kernel.f90
      76, Loop is parallelizable
          Generating Multicore code
          76, !$acc loop gang
      77, Loop is parallelizable
          3 loop-carried redundant expressions removed with 
               9 operations and 9 arrays
          Generated vector simd code for the loop

CloverLeaf compiled for OpenACC parallel execution across all 20 OpenPOWER CPU cores of an IBM Minsky server runs in about 17 seconds. The identical source code compiled for execution on one Tesla Pascal P100 GPU in the same system runs in about 4 seconds, a roughly 4x speed-up over multicore CPU execution.

NVLink: Tearing Down the Memory Wall Between CPUs and GPUs

In addition to ease of porting between Linux/x86 and Linux/OpenPOWER platforms, the new PGI compilers enable interoperability of OpenACC and NVIDIA’s CUDA 8.0 Unified Memory features for Pascal GPUs. Specifying the -ta=tesla:managed option to the PGI OpenACC compilers enables this feature, in which most types of allocatable data are placed in CUDA Unified Memory. Movement of these variables and data structures between CPU main memory and GPU device memory is then managed by the CUDA memory manager on a page-by-page basis, rather than by the programmer using OpenACC directives or the compiler runtime system.

Programs developed in this mode can require substantially less initial development time, as shown in a recent joint webinar presented by NVIDIA and IBM. The chart below shows the performance of the SPEC ACCEL 1.0 OpenACC benchmarks running on one Pascal-based Tesla P100 GPU when compiled to use CUDA Unified Memory, relative to their performance with user-directed and optimized data movement. On a Minsky system with NVLink between the POWER8 CPUs and Tesla P100 GPUs, the versions of the 15 SPEC ACCEL benchmarks compiled to use CUDA Unified Memory average within 10% of the versions with user-directed data movement of all allocatable data:


Three of the benchmarks, including 357.csp, use only static data, so the CUDA Unified Memory feature does not apply. The other 12 benchmarks all make substantial use of allocatable data.

“Easier programming methodologies like OpenMP and OpenACC are critical for the widespread adoption of GPU-accelerated systems,” said Sumit Gupta, Vice President of High Performance Computing & Data Analytics, IBM. “The new PGI compilers take advantage of the high-speed NVIDIA NVLink connection between the POWER8 CPU and the NVIDIA Tesla P100 GPU accelerators, along with the page migration engine, to make it much easier to accelerate and enhance performance of high performance computing and data analytics workloads.”

PGI is demonstrating the PGI Accelerator compilers for OpenPOWER in booth 2131 at SC16 in Salt Lake City, Nov. 14–17. Additional information is available from PGI, and the new PGI compilers are available for download.


Speeding Up Precision Medicine with Barcelona Supercomputing Center

By Mateo Valero and Enric Banda, Barcelona Supercomputing Center

Barcelona Supercomputing Center joins OpenPOWER

The last decade has seen growing worldwide interest in Precision Medicine (PM). As a result, a number of computing platforms have been set up in different centres and countries following different strategies and road maps.

A common challenge all these initiatives face has to do with the management and analysis of genomic data. For this reason, investment in and development of the computing resources devoted to this goal have increased recently. The search for optimal software-hardware combinations for building robust, efficient and accurate systems and environments for PM is one such example.

Most public administrations have, therefore, paid attention to Precision Medicine either to give it momentum as a key part of biomedical research, such as the US, or to start introducing it into the public health system, like the United Kingdom. In Spain, the recent example comes from the Catalan Government, as recently expressed by the Department of Health in “Catalonia Crafts Strategic Framework for Personalised Medicine” published by the Personalised Medicine Coalition in its fall 2016 issue.

The Barcelona Supercomputing Center has the conditions and skills to be a leading agent in Precision Medicine. Its Life Sciences department has long and successful experience in international genomic research projects such as those promoted by the International Cancer Genome Consortium. Its advanced research groups in high performance computing are specialists in managing big amounts of data, introducing cognitive techniques for its analysis and the development of computational technologies to apply to the most diverse scientific fields. Together, they are constructing hardware-software platforms to optimize the flows and pipelines of genomic variations analysis. It goes without saying that the BSC also has the appropriate infrastructure in terms of computing capacity as well as storage of massive amounts of data.

Together with our experience of working closely with hospitals and clinicians, a fundamental part of the project and the one closest to patients’ interests, this combination makes our centre a perfect ecosystem for the development and application of computational approaches for clinical genomics. A recent competitive call for proposals on PM from the Catalan Government has shown that the BSC is centralizing the computing needs, as it is involved in most of the projects being carried out near Barcelona. As one of the most active hubs in biomedical research in Europe, the BSC is ready to seize the opportunity to become a key element in PM projects in Spain.

Needless to say, the complexity of the challenge makes the multi-stakeholder alliance a prerequisite. A platform is being designed within the BSC and will be shortly put in place to bind together the main actors and stakeholders in both research and health care. Having industrial technological partners willing to collaborate on the project is also a sine qua non. This is why BSC decided to join the OpenPOWER Foundation. The complementary knowledge provided by the foundation and the cooperation from IBM, with its new architectures and the huge capacity of IBM Watson, is undoubtedly a valuable asset. Pharmaceutical companies also have an essential role in this science, technology and health chain. Together, they form a chain to be woven as quickly and accurately as the health of the present and future generations deserves.

For more information, see the presentation we recently shared at the OpenPOWER Summit Europe.

Evaluating Julia for Deep Learning on Power Systems + NVIDIA Hardware

By Deepak Vinchhi, Co-Founder and Chief Operating Officer, Julia Computing, Inc. 

Deep Learning is now ubiquitous in the machine learning world, with useful applications in a number of areas. In this blog post, we explore the use of Julia for deep learning experiments on Power Systems + NVIDIA hardware.

We shall demonstrate:

  1. The ease of specifying deep neural network architectures in Julia and visualizing them. We use MXNet.jl, a Julia package for deep learning.
  2. The ease of running Julia on Power Systems. We ran all our experiments on a PowerNV 8335-GCA, which exposes 160 hardware threads, and a Tesla K80 (dual) GPU accelerator. IBM and OSUOSL generously provided us with the infrastructure for this analysis.


Deep neural networks have been around since the 1940s, but have only recently been deployed in research and analytics because of strides and improvements in computational horsepower. Neural networks have a wide range of applications in machine learning: vision, speech processing, and even self-driving cars. An interesting use case for neural networks could be the ability to drive down costs in medical diagnosis. Automated detection of diseases would be of immense help to doctors, especially in places around the world where access to healthcare is limited.

Diabetic retinopathy is an eye disease brought on by diabetes. There are over 126.2 million people in the world (as of 2010) with diabetic retinopathy, and this is expected to rise to over 191.2 million by 2030. According to the WHO in 2006, it accounted for 5% of world blindness.

Hence, early automatic detection of diabetic retinopathy would be desirable. To that end, we took up an image classification problem using real clinical data. The data was provided to us by Drishti Care, which is a social enterprise that provides affordable eye care in India. We obtained a number of eye fundus images from a variety of patients. The eyes affected by retinopathy are generally marked by inflamed veins and cotton spots. The following picture on the left is a normal fundus image whereas the one on the right is affected by diabetic retinopathy.



We built MXNet from source with CUDA and OpenCV. This was essential for training our networks on GPUs with CUDNN, and reading our image record files. We had to build GCC 4.8 from source so that our various libraries could compile and link without error, but once we did, we were set up and ready to start working with the data.

The Hardware: IBM Power Systems

We chose to run this experiment on an IBM Power System because, at the time of this writing, we believe it is the best environment available for this sort of work. The Power platform is ideal for deep learning, big data, and machine learning due to its high performance, large caches, 2x-3x higher memory bandwidth, very high I/O bandwidth, and of course, tight integration with GPU accelerators. The parallel multi-threaded Power architecture with high memory and I/O bandwidth is particularly well adapted to ensure that GPUs are used to their fullest potential.

We’re also encouraged by the industry’s commitment to the platform, especially with regard to AI, noting that NVIDIA made its premier machine learning-focused GPU (the Tesla P100) available on Power well before the x86, and that innovations like NVLink are only available on Power.

The Model

The idea is to train a deep neural network to classify all these fundus images into infected and uninfected images. Along with the fundus images, we have at our disposal a number of training labels identifying if the patient is infected or not.

We used MXNet.jl, a powerful Julia package for deep learning. This package allows the user to use a high level syntax to easily specify and chain together large neural networks. One can then train these networks on a variety of heterogeneous platforms with multi-GPU acceleration.

As a first step, it’s good to load a pretrained model which is known to be good at classifying images. So we decided to download and use the ImageNet model called Inception with weights in their 39th epoch. On top of that we specify a simple classifier.

# Extend model as we wish
arch = mx.@chain mx.get_internals(inception)[:global_pool_output] =>
       mx.Flatten() =>
       mx.FullyConnected(num_hidden = 128) =>
       mx.Activation(act_type = :relu) =>
       mx.FullyConnected(num_hidden = 2) =>
       mx.WSoftmax(name = :softmax)

And now we train our model:

mx.fit(model, optimizer, train_data,
       n_epoch = 100,
       eval_data = test_data,
       callbacks = [
         mx.every_n_epoch(save_acc, 1, call_on_0 = false),
         mx.do_checkpoint(prefix, save_epoch_0 = true)],
       eval_metric = mx.MultiMetric([mx.Accuracy(), WMultiACE(2)]))

One feature of the data is that it is highly imbalanced. For every 200 uninfected images, we have only 3 infected images. One way of approaching that scenario is to penalize the network heavily for every infected case it gets wrong. So we replaced the normal Softmax layer towards the end of the network with a weighted softmax. To check whether we are overfitting, we selected multiple performance metrics.
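The weighted-softmax idea can be sketched outside of MXNet as well. The following Python/NumPy fragment is an illustrative sketch, not the WSoftmax code used in the post; the logits, labels and weight values are invented. It shows how scaling the cross-entropy by a per-class weight makes mistakes on the rare infected class dominate the loss:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def weighted_softmax_loss(logits, labels, class_weights):
    # Cross-entropy where each example is scaled by the weight of its
    # true class, so errors on the rare class cost far more.
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-class_weights[labels] * np.log(picked + 1e-12)))

# Class 0 = uninfected (common), class 1 = infected (rare).
# With roughly 200 uninfected images per 3 infected, weight the rare
# class about 67x to offset the imbalance (an illustrative choice).
class_weights = np.array([1.0, 200.0 / 3.0])
logits = np.array([[2.0, 0.5],    # confident, correct "uninfected"
                   [1.5, 0.2]])   # confident "uninfected", actually infected
labels = np.array([0, 1])
loss = weighted_softmax_loss(logits, labels, class_weights)
unweighted = weighted_softmax_loss(logits, labels, np.array([1.0, 1.0]))
```

With the weights applied, the single missed infected case drives the loss far above its unweighted value, which is exactly the training pressure the post describes.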

However, from our cross-entropy measures, we found that we were still overfitting. With fast training times on dual GPUs, we trained our model quickly to understand the drawbacks of our current approach.

Performance Comparison between CPU and GPU on Training

Therefore we decided to employ a different approach.

The second way to deal with our imbalanced dataset is to generate smaller, more balanced datasets that contained roughly equal numbers of uninfected images and infected images. We produced two datasets: one for training and another for cross validation, both of which had the same number of uninfected and infected patients.

Additionally, we decided to shuffle our data. Every epoch, we resampled the uninfected images in the training dataset from the larger pool of uninfected images (of which there were many), exposing the model to a range of uninfected images so that it could generalize well. Then we did the same for the infected images. This was quite simple to implement in Julia: we simply had to overload a particular function and modify the data.
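The per-epoch resampling can be sketched in a few lines. This sketch is in Python rather than Julia for illustration, and the file names and pool sizes are stand-ins, not the post's actual data:

```python
import random

def balanced_epoch(uninfected, infected, rng):
    # Draw a fresh sample from the large uninfected pool each epoch,
    # matched in size to the small infected set, then shuffle so the
    # classes are interleaved.
    sampled = rng.sample(uninfected, k=len(infected))
    epoch = sampled + list(infected)
    rng.shuffle(epoch)
    return epoch

# Toy stand-ins for image record entries (about a 200:3 imbalance)
uninfected = [f"healthy_{i:03d}.jpg" for i in range(200)]
infected = ["infected_000.jpg", "infected_001.jpg", "infected_002.jpg"]

rng = random.Random(42)
epoch = balanced_epoch(uninfected, infected, rng)
```

Each call yields a small, balanced epoch, and over many epochs the model still sees a wide cross-section of the uninfected pool.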

Most of these steps were done incrementally. Our Julia setup and environment made it easy for us to quickly change code and train models and incrementally add more tweaks and modifications to our models as well as our training methods.

We also augmented our data by adding low levels of Gaussian noise to random images from both the uninfected and the infected sets, and some images were randomly rotated by 180 degrees. Rotations suit this use case well because the important spatial features are preserved. This artificially expanded our training set.
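Those two augmentations can be sketched in Python/NumPy; the noise level, rotation probability and image size here are illustrative assumptions rather than the values used in the experiments:

```python
import numpy as np

def augment(image, rng, noise_std=0.02, rotate_prob=0.5):
    # Add low-level Gaussian noise, and rotate by 180 degrees half the
    # time; a half-turn preserves the spatial features that matter in
    # fundus images (inflamed veins, cotton spots).
    out = image + rng.normal(0.0, noise_std, size=image.shape)
    if rng.random() < rotate_prob:
        out = np.rot90(out, k=2, axes=(0, 1))  # 180-degree rotation
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))   # stand-in for a fundus image in [0, 1]
aug = augment(img, rng)
```

Because the rotation is a half-turn in the image plane, the augmented copy keeps the same shape and value range as the original.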

However, we found that while these measures stopped our model from overfitting, we could not obtain adequate performance. We explore the possible reason for this in the subsequent section.


Since the different approaches we outlined in the previous section were easy to implement within our Julia framework, our experimentation could be done quickly and these various challenges were easy to pinpoint.

The initial challenge we faced was that our data was imbalanced, so we experimented with penalizing incorrect decisions made by the classifier. We then tried generating a balanced (yet smaller) dataset, and it turned out that we were overfitting. To counter this, we applied the shuffling and data augmentation techniques. But we still did not get adequate performance from the model.

Why is that so? Why is it that a model as deep as Inception wasn’t able to train effectively on our dataset?

The answer, we believe, lies in the data itself. On a randomized sample from the data, we found that there were two inherent problems with the data: firstly, there are highly blurred images with no features among both healthy and infected retinas.

Images such as these make it difficult to extract features

Secondly, there are some features in the healthy images that one might expect to find in the infected images. For instance, in some images the veins are somewhat puffed, and in others there are cotton spots. Below are some examples. While we note that the picture on the left is undoubtedly infected, notice that one on the right also has a few cotton spots and inflamed veins. So how does one differentiate? More importantly, how does our model differentiate?


So what do we do about this? For the training set, it would be helpful to have each image, rather than each patient, independently diagnosed as healthy or infected by a doctor or by two doctors working independently. This would likely improve the model’s predictions.

The Julia Advantage

Julia provides a distinct advantage at every stage for scientists engaged in machine learning and deep learning.

First, Julia is very efficient at preprocessing data. A very important first step in any machine learning experiment is to organize, clean up and preprocess large amounts of data. This was extremely efficient in our Julia environment, which can be orders of magnitude faster than comparable environments such as Python.

Second, Julia enables elegant code. Our models were chained together using Julia’s flexible syntax. Macros, metaprogramming and syntax familiar to users of any technical computing environment allow for easy-to-read code.

Third, Julia facilitates innovation. Since Julia is a first-class technical computing environment, we can easily deploy the models we create without changing any code. Julia hence solves the famous “two-language” problem, by obviating the need for different languages for prototyping and production.

Due to all the aforementioned advantages, we were able to complete these experiments in a very short period of time compared with other comparable technical computing environments.

Call for Collaboration

We have demonstrated in this blog post how to write an image classifier based on deep neural networks in Julia and how easy it is to perform multiple experiments. Unfortunately, there are challenges with the dataset that required more fine-grained labelling. We have reached out to appropriate experts for assistance in this regard.

Users who are interested in working with the dataset and possibly collaborating with us on this are invited to reach out via email to discuss access to the dataset.


I would like to thank a number of people for helping me with this work: Valentin Churavy and Pontus Stenetorp for guiding and mentoring me, and Viral Shah of Julia Computing. Thanks also to IBM and OSUOSL for providing the hardware, and to Drishti Care for providing the data.

Join the SC16 Treasure Hunt!


Calling all Treasure Hunters!

We’re going to give you a chance to be a part of the OpenPOWER Revolution – but you’re going to have to earn it. Guided by our clues, we’ll show you the latest advancements and applications on the OpenPOWER platform.

Use what you discover to solve any three of our five clues below!

Once we verify your answers we’ll reward you for your efforts with a FREE custom designed OpenPOWER T-shirt to serve as a wearable trophy for your successful completion of our Treasure Hunt!

Here is your first clue!

To solve this clue, watch this new video showcasing how OpenPOWER members Kinetica, NVIDIA, and IBM are helping retailers analyze data faster than ever before!

Then tell us how much faster Kinetica runs on the IBM-NVIDIA system in the Google Form!

You’re on your way! Recognize NVIDIA CEO Jen-Hsun Huang? He has the key to the next clue!


Read NVIDIA CEO Jen-Hsun Huang’s blog post, the Intelligent Industrial Revolution, and answer the question in the Google form below!

According to NVIDIA CEO Jen-Hsun Huang, what is IBM’s new POWER8-NVLink server designed to bring?

Almost halfway there! Here’s the third clue in the #HexMarksTheSpot Treasure Hunt!

Hounding for the solution? Try out the OpenPOWER Dog Identification Demo using GPUs! Share a screenshot of your ID’d dog on Twitter using the hashtag #HexMarksTheSpot to complete this clue!

Want to learn more about deep learning on OpenPOWER and how the demo works? Visit our blog post, Deep Learning Goes to the Dogs. You’re solving this Treasure Hunt so fast you’re making Captain Jack Sparrow jealous! Here’s your fourth clue.


The answer also can be found in OpenPOWER advocate Sumit Gupta’s blog post, “IBM turns POWER HPC momentum up to 11!”

After reading it, use the Google form to tell us how many of Intersect360’s Top 10 HPC Applications are currently supported on OpenPOWER!

Can you see the Hex yet?! Here is your Final Clue!


See what the possibilities are by reading about the new PowerAI Deep Learning package.

Tell us three of the deep learning distributions supported by the new package in the Google Form.

That’s it, you did it! Complete the Google Form below, including your shipping information. Once we verify your answers, we’ll let you know you’re a winner and send your t-shirt! Please allow 5-10 business days for processing and shipping.



Share your Treasure Hunt progress with the hashtag #HexMarksTheSpot!

Deep Learning Goes to the Dogs

By Indrajit Poddar, Yu Bo Li, Qing Wang, Jun Song Wang, IBM

These days you can see machine and deep learning applications in so many places. Get driven by a driverless car. Check if your email is really conveying your sense of joy with the IBM Watson Tone Analyzer, and see IBM Watson beat the best Jeopardy player in the world in speed and accuracy. Facebook is even using image recognition tools to suggest tagging people in your photos; it knows who they are!

Barking Up the Right Tree with the IBM S822LC for HPC

We wanted to see what it would take to get started building our very own deep learning application and host it in a cloud. We used the open source deep learning framework, Caffe, and example classification Jupyter notebooks from GitHub, like classifying with ImageNet. We found several published trained models, e.g. GoogLeNet from the Caffe model zoo. For a problem, we decided to use dog breed classification. That is, given a picture of a dog, can we automatically identify the breed? This is actually a class project from Stanford University with student reports, such as this one from David Hsu.

We started from the GoogLeNet model and created our own model trained on the Stanford Dogs Dataset using a system similar to the IBM S822LC for HPC systems with NVIDIA Tesla P100 GPUs connected to the CPU with NVIDIA NVLink. As David remarked in his report, without GPUs, it takes a very long time to train a deep learning model on even a small-sized dataset.

Using a previous-generation IBM S822LC OpenPOWER system with an NVIDIA Tesla K80 GPU, we were able to train our model in only a few hours. The IBM S822LC for HPC systems feature not only the powerful NVIDIA Tesla P100 GPUs, but also two IBM POWER8 processors connected to the GPUs via NVIDIA NVLink. These systems make data transfers between main memory and GPUs significantly faster than systems with PCIe interconnects.
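As a rough illustration of why the interconnect matters, the sketch below compares ideal host-to-GPU transfer times at assumed peak bandwidths (about 16 GB/s per direction for a PCIe Gen3 x16 link and about 40 GB/s for a POWER8 NVLink connection); both figures and the dataset size are illustrative assumptions, not measurements:

```python
# Back-of-envelope: time to move a training dataset from host memory to GPU.
# Bandwidth figures below are illustrative assumptions, not measured values.
DATASET_GB = 20.0  # e.g. an image dataset staged into GPU memory in batches

def transfer_seconds(size_gb, bandwidth_gb_s):
    """Ideal transfer time, ignoring protocol overhead."""
    return size_gb / bandwidth_gb_s

pcie_s = transfer_seconds(DATASET_GB, 16.0)    # PCIe Gen3 x16, ~16 GB/s
nvlink_s = transfer_seconds(DATASET_GB, 40.0)  # NVLink, ~40 GB/s
print(f"PCIe:   {pcie_s:.2f} s")
print(f"NVLink: {nvlink_s:.2f} s  ({pcie_s / nvlink_s:.1f}x faster)")
```

The model ignores real-world overheads, but it shows why repeatedly staging batches to the GPU is noticeably cheaper over a higher-bandwidth link.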

Doggy Docker for Deep Learning

We put our Caffe model and our classification code written in Python into a web application inside a Docker container and deployed it with Apache Mesos and Marathon. Apache Mesos is an open source cluster management application with fine-grained resource scheduling features which now recognize GPUs as cluster-wide resources.

In addition to Apache Mesos, it is possible to run cluster managers, such as Kubernetes, Spectrum Conductor for Containers, and Docker GPU management components, such as nvidia-docker on OpenPOWER systems (see presentation). In addition to Caffe, it is possible to run other popular deep learning frameworks and tools such as Torch, Theano, DIGITS and TensorFlow on OpenPOWER systems.
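To make the deployment step concrete, here is a hypothetical Marathon application definition for a GPU-backed classifier container, expressed as the JSON document Marathon accepts; the image name, app id, and resource sizes are invented for illustration:

```python
import json

# Hypothetical Marathon app definition for the Dockerized classifier.
# Image name, id, and resource sizes are made-up examples.
app = {
    "id": "/dog-breed-classifier",
    "container": {
        "type": "DOCKER",
        "docker": {"image": "example/caffe-classifier:latest"},
    },
    "cpus": 2.0,
    "mem": 4096,   # MB
    "gpus": 1,     # Mesos schedules GPUs as cluster-wide resources
    "instances": 1,
}

print(json.dumps(app, indent=2))
```

A definition like this would be POSTed to Marathon, which asks Mesos for an offer that includes a free GPU before launching the container.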

This lab tutorial walks through some simple sample use cases. In addition, some cool examples can be seen from the results of the recently concluded OpenPOWER Developer Challenge.

This Dog Will Hunt

Our little GPU-accelerated pet breed classification micro-service is running in a Docker container and can be accessed at this link from a mobile device or laptop. See for yourself!

For example, given this image link from a Google search for “dog images”, we got this correct classification in 0.118 secs:

German Shepherd Deep Learning Dogs

You can also spin up your own GPU Docker container with deep learning libraries (e.g. Caffe) in the NIMBIX cloud and train your own model and develop your own accelerated classification example.


Give it a try and share your screenshots in the comments section below!

CAPI SNAP: The Simplest Way for Developers to Adopt CAPI

By Bruce Wile, CAPI Chief Engineer and Distinguished Engineer, IBM Power Systems

Last week at OpenPOWER Summit Europe, we announced a brand-new framework designed to make it easy for developers to begin using CAPI to accelerate their applications. The CAPI Storage, Network, and Analytics Programming Framework, or CAPI SNAP, was developed through a multi-company effort by OpenPOWER members and is now in alpha testing with multiple early-adopter partners.

But what exactly puts the “snap” in CAPI SNAP? To answer that, I want to give you a deeper look at the magic behind it. The framework extends the CAPI technology by simplifying both the API (the call to the accelerated function) and the coding of the accelerated function itself. By using CAPI SNAP, your application gains performance both from FPGA acceleration and from moving compute resources closer to the vast amounts of data.

A Simple API

ISVs will be particularly interested in the programming enablement in the framework. The framework API makes it a snap for an application to call for an accelerated function. The innovative FPGA framework logic implements all the computer engineering interface logic, data movement, caching, and pre-fetching work—leaving the programmer to focus only on the accelerator functionality.

Without the framework, an application writer must create a runtime acceleration library to perform the tasks shown in Figure 1.

Figure 1: Calling an accelerator using the base CAPI hardware primitives


But now with CAPI SNAP, an application merely needs to make a function call as shown in Figure 2. This simple API has the source data (address/location), the specific accelerated action to be performed, and the destination (address/location) to send the resulting data.
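The call shape described here can be modeled in a few lines. The following is a hypothetical Python sketch of the idea, not the real CAPI SNAP API; all names are illustrative:

```python
# Hypothetical model of the CAPI SNAP call shape: the caller supplies a
# source, an action, and a destination; the framework moves the data and
# runs the action. These names are illustrative, not the actual API.
def snap_call(action, source, destination, memory):
    """Run `action` on the data at `source`; store the result at `destination`."""
    data = memory[source]               # framework fetches the source data
    memory[destination] = action(data)  # ...runs the action, puts results away
    return destination

memory = {"storage:block0": "the quick brown fox"}
snap_call(lambda text: text.upper(), "storage:block0", "sysmem:results", memory)
print(memory["sysmem:results"])  # THE QUICK BROWN FOX
```

The point of the real API is the same as in this toy: the application names the data and the action, and the data movement is entirely the framework’s job.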

Figure 2: Accelerated function call with CAPI SNAP


The framework takes care of moving the data to the accelerator and putting away the results.

Moving the Compute Closer to the Data

The simplicity of the API parameters is elegant and powerful. Not only can source and destination addresses be coherent system memory locations, but they can also be attached storage, network, or memory addresses. For example, if a framework card has attached storage, the application could source a large block (or many blocks) of data from storage, perform an action such as a search, intersection, or merge function on the data in the FPGA, and send the search results to a specified destination address in main system memory. This method has large performance advantages compared to the standard software method as shown in Figure 3.

Figure 3: Application search function in software (no acceleration framework)


Figure 4 shows how the source data flows into the accelerator card via the QSFP+ port, where the FPGA performs the search function. The framework then forwards the search results to system memory.

Figure 4: Application with accelerated framework search engine


The performance advantages of the framework are twofold:

  1. By moving the compute (in this case, search) closer to the data, the FPGA has a higher bandwidth access to storage.
  2. The accelerated search on the FPGA is faster than the software search.

Table 1 shows a 3x performance improvement between the two methods. By moving the compute closer to the data, the FPGA has a much higher ingress (or egress) rate versus moving the entire data set into system memory.

                           POWER + CAPI SNAP Framework        Software-only
Ingest 100 GB of data      Two 100 Gb/s ports: 4 seconds      One PCIe Gen3 x8 NIC: 12.5 seconds
Perform search             <1 us                              <100 us
Results to system memory   <400 ns                            0 (results are already in system memory)
Total                      ~4.0000014 seconds                 ~12.5001 seconds
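Table 1’s arithmetic can be checked with a quick back-of-envelope calculation. The sketch below assumes ideal peak rates (two 100 Gb/s ports = 25 GB/s; the PCIe Gen3 x8 NIC taken as 8 GB/s, an assumed figure) plus the table’s bound values for the search and result-return steps:

```python
# Reproducing Table 1's ingest arithmetic with idealized peak rates.
# The 8 GB/s figure for a PCIe Gen3 x8 NIC is an assumption.
DATA_GB = 100

snap_ingest = DATA_GB / (2 * 100 / 8)  # two 100 Gb/s ports = 25 GB/s -> 4 s
sw_ingest = DATA_GB / 8                # one PCIe Gen3 x8 NIC, ~8 GB/s -> 12.5 s

snap_total = snap_ingest + 1e-6 + 400e-9  # + FPGA search (<1 us) + results (<400 ns)
sw_total = sw_ingest + 100e-6             # + software search (<100 us)

print(f"CAPI SNAP:     {snap_total:.7f} s")
print(f"Software-only: {sw_total:.4f} s  ({sw_total / snap_total:.1f}x slower)")
```

The ratio of the two totals comes out near the roughly 3x improvement the text cites, and it is dominated entirely by the ingest step.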

Simplified Programming of Acceleration Actions

The programming API isn’t the only simplification in CAPI SNAP. The framework also makes it easy to program the “action code” on the FPGA. The framework takes care of retrieving the source data (whether it is in system memory, storage, the network, etc.) as well as sending the results to the specified destination. The programmer, writing in a high-level language such as C/C++ or Go, needs only to focus on the data transform, or “action.” Framework-compatible compilers translate the high-level language to Verilog, which in turn is synthesized using Xilinx’s Vivado toolset.

With CAPI SNAP, the accelerated search code (searching for one occurrence) is this simple:

for (i = 0; i < Search.text_size; i++) {
    if (buffer[i] == Search.text_string) {
        Search.text_found_position = i;
        break;  /* stop at the first occurrence */
    }
}

The open source release will include multiple, fully functional example accelerators to provide users with the starting points and the full port declarations needed to receive source data and return destination data.


Are you looking to explore CAPI SNAP for your organization’s own data analysis? Then apply to be an early adopter of CAPI SNAP by emailing us directly. Be sure to include your name, organization, and the type of accelerated workloads you’d like to explore with CAPI SNAP.

You can also read more about CAPI and its capabilities in the accelerated enterprise in our CAPI series on the OpenPOWER Foundation blog.

You will continue to see a drumbeat of activity around the framework, as we release the source code and add more and more capabilities in 2017.



OpenPOWER Makes FPGA Acceleration a “SNAP”

By Bruce Wile, CAPI Chief Engineer and Distinguished Engineer, IBM

Improving on the CAPI Base Technology

In the datacenter, metrics matter.  Competition between application providers is fierce, with pressure to provide benchmarks that show continued competitive advantages in performance, price, and power.  Application-level improvements rode the Moore’s law performance curve for decades, and now require accelerator innovations to deliver the performance gains needed to keep current clients and win new business.  FPGA acceleration has long been an option, but the difficult programming model and the specialized computer engineering skills required have kept FPGAs out of mainstream datacenters.

The biggest companies see this trend and have put significant resources into integrating FPGAs into the datacenter. But enabling FPGA acceleration for the masses has been a challenge. OpenPOWER’s Accelerator Work Group is changing that.


The CAPI infrastructure, introduced on POWER8 in 2014, provides the technology and ecosystem foundation to enable datacenter applications to integrate with FPGA acceleration.  The technology base has everything needed to support the datacenter—virtualization (for multiple simultaneous context calls), a threaded model (for programming ease), removal of the device driver overhead (performance enablement), and an open ecosystem (for the masses to build upon).

As a result, FPGA experts around the world have created CAPI accelerators, many of which are listed online. These are creative, compelling acceleration algorithms that open doors to capabilities previously beyond reach.


Check out “Facial analysis for emotion detection” from SiliconScapes for a slick example.

But there’s still a skills gap between the FPGA experts (computer engineers) and the programming experts working for most Independent Software Vendors (ISVs).  For FPGAs to deliver on their promise of higher performance at lower cost and lower power, we need further enablement for ISVs to embrace FPGA acceleration.

“Extending the capability of the CAPI device will provide our engineers and ultimately our users with more options for working efficiently with complex connected data,” explains Philip Rathle, VP of Products at OpenPOWER member Neo4j.

Accelerating Acceleration

Enter OpenPOWER and the Accelerator Workgroup.  At April 2016’s OpenPOWER Summit, multiple companies agreed to create a framework around CAPI. Two significant directives drove the work effort that followed:

  1. The framework would make it easy for programmers to call accelerators and write their own acceleration IP.
  2. The framework would be open source to enable continued enhancements and cross company collaboration.

Collaboration grew around building the framework, with significant contributions from IBM, Xilinx, Rackspace, Eideticom, Alpha-Data, and Nallatech.  Each company brought unique skills and perspectives to the effort, with a common goal of releasing the first version of the open source framework by the end of 2016.


Bringing Developers CAPI in a SNAP!

Today, at OpenPOWER Summit Europe, we are announcing the CAPI Storage, Network, and Analytics Programming Framework, or CAPI SNAP Framework.  The framework fulfills the team’s initial vision, and will grow beyond this first release.  Upon release, the framework, including source code, will be available for anyone to try via GitHub.

The framework is key for developers or anyone else looking to bring the power of FPGA acceleration to their data center. CAPI SNAP will:

  • Make it easy for developers to create new specialized algorithms for FPGA acceleration in high-level programming languages, like C++ and Go, instead of less user-friendly languages like VHDL and Verilog.
  • Make FPGA acceleration more accessible to ISVs and other organizations to bring faster data analysis to their users.
  • Leverage the OpenPOWER Foundation ecosystem to continually drive collaborative innovation.

Levyx Chief Business Development Officer Bernie Wu already sees how CAPI SNAP can make an impact for the ISV. “Levyx is focused on accelerating Big Data Analytical and Transactional Operations to real-time velocities. The CAPI SNAP Framework will allow us to bring processing even closer to the data and simplify the programming model for this acceleration,” adding “we see the CAPI SNAP capability being used to initially boost or enable rich real-time analytics and stream processing in variety of increasingly Machine to Machine driven use cases.”

Learn More and Try CAPI SNAP for Yourself!

For those interested in the CAPI SNAP Framework, we encourage you to watch for announcements at the OpenPOWER Summit Europe.  You can also read more about CAPI and its capabilities in the accelerated enterprise in our CAPI series on the OpenPOWER Foundation blog.

Are you looking to explore CAPI SNAP for your organization’s own data analysis? Then apply to be an early adopter of CAPI SNAP by emailing us directly. Be sure to include your name, organization, and the type of accelerated workloads you’d like to explore with CAPI SNAP.

You will continue to see a drumbeat of activity around the framework, as we release the source code and add more and more capabilities in 2017.

Additional CAPI SNAP Reading from OpenPOWER Members


Barcelona Supercomputing Center Adds HPC Expertise to OpenPOWER

By Eduard Ayguadé, Computer Sciences Associate Director at BSC

Barcelona Supercomputing Center joins OpenPOWER

The Barcelona Supercomputing Center (BSC) is Spain’s national supercomputing facility. Our mission is to investigate, develop, and manage information technologies to facilitate scientific progress. The Center was officially constituted in April 2005 with four scientific departments: Computer Sciences, Computer Applications in Science and Engineering, Earth Sciences, and Life Sciences. In addition, the Center’s Operations department manages MareNostrum, one of the most powerful supercomputers in Europe. The activities of these departments are complementary and tightly related, forming a multidisciplinary loop: computer architecture, programming models, runtime systems and resource managers, performance analysis tools, and algorithms and applications in the above-mentioned scientific and engineering areas.

Joining the OpenPOWER Foundation will allow BSC to advance its mission, improving the way we contribute to the scientific and technological HPC community and, ultimately, serve society. BSC plans to actively participate in the different OpenPOWER working groups with the objective of sharing our research results, prototype implementations, and know-how with the other members to influence the design of future systems based on the POWER architecture. As a member of OpenPOWER, BSC hopes to gain visibility and opportunities to collaborate with other leading institutions on high-performance architectures, programming models, and applications.

In the framework of the current IBM-BSC Deep Learning Center initiative, BSC and IBM will collaborate on research and development projects in the deep learning domain, an essential component of cognitive computing, with a focus on developing new algorithms to improve and expand the cognitive capabilities of deep learning systems. Additionally, the center will do research on flexible computing architectures fundamental for big data workloads, such as data-centric systems and applications.

Researchers at BSC have been working on policies to optimally manage, from the runtime system, the hardware resources available in POWER-based systems, including prefetching, multithreading degree, and energy-securing. These policies are driven by the information provided by the per-task performance and power counters and control knobs available in POWER architectures. BSC researchers have also been collaborating with the compiler teams at IBM on the implementation and evolution of the OpenMP programming model to support accelerators; evaluating new SKV (Scalable Key-Value) storage capabilities on top of novel memory and storage technologies, including bug reporting and fixing, using Smufin, one of the key applications at BSC supporting personalized medicine; and exploring NUMA-aware placement strategies in POWER architectures to deploy containers based on workload characteristics and system state.

Today, during the OpenPOWER Summit Europe in Barcelona, the director of BSC, Prof. Mateo Valero, will present the mission and main activities of the Center and the different departments at the national, European and international level. After that, he will present the work that BSC is conducting with different OpenPOWER members, including IBM, NVIDIA, Samsung, and Xilinx, with a special focus on the BSC and IBM research collaboration in the last 15 years.

Advancing the Human Brain Project with OpenPOWER

By Dr. Dirk Pleiter, Research Group Leader, Jülich Supercomputing Centre

Human Brain Project and OpenPOWER members NVIDIA, IBM

The Human Brain Project (HBP), a flagship project funded by the European Commission, has set itself an ambitious goal: Unifying our understanding of the human brain. To achieve it, researchers need a High-Performance Analytics and Compute Platform comprised of supercomputers with features that are currently not available, but OpenPOWER is working to make them a reality.

Through a Pre-Commercial Procurement (PCP), the HBP initiated the necessary R&D and turned to the OpenPOWER Foundation for help. Across three consecutive phases, a consortium of IBM and NVIDIA has successfully been awarded R&D contracts. As part of this effort, a pilot system called JURON (a combination of Jülich and neuron) has been installed at the Jülich Supercomputing Centre (JSC). It is based on the new IBM S822LC for HPC servers, each equipped with two POWER8 processors and four NVIDIA P100 GPUs.

Soon after deployment, Marcel Huysegoms, a scientist from the Institute for Neuroscience and Medicine, could demonstrate, with support from JSC, the usability of the system for his brain image registration application. By exploiting the processing capabilities of the GPUs without further tuning, he achieved a significant speed-up compared to the currently used production system based on Haswell x86 processors and K80 GPUs.

The improved compute capabilities are not the only thing that matters for brain research. By designing and implementing the Global Sharing Layer (GSL), the non-volatile memory cards mounted on all nodes become a byte-addressable, globally accessible memory resource. Using JURON, it could be shown that data can be read at a rate limited only by network performance. These new technologies will open up opportunities for enabling data-intensive workflows in brain research, including data visualization.

The pilot system will be the first POWER-based system to bring graphics support to the HPC node. In combination with the GSL, it will be possible to visualize large data volumes such as those generated by brain model simulations. Flexible allocation of resources to compute applications, data analytics, and visualization pipelines will be facilitated through another new component, the dynamic resource manager, which allows parallel jobs to be suspended and later restarted with a different number of processes.

JURON clearly demonstrates the potential of a technology ecosystem built around a processor architecture with interfaces that facilitate the efficient integration of various devices for processing, moving, and storing data. In other words, it demonstrates the collaborative potential of OpenPOWER.

Innovation Unleashed: OpenPOWER Developer Challenge Winners Announced

By John Zannos, Chairman, OpenPOWER Foundation

Developers play an integral role within the OpenPOWER ecosystem, and we’re passionate about welcoming them and their creative minds into our growing community. That’s why we decided to embark on our very first OpenPOWER Developer Challenge this past spring. When we made the call for entries, we were thrilled to see over 300 individuals seize this opportunity to tap into the accessible OpenPOWER platform.

As part of the Challenge, we provided developers access to the SuperVessel Developer Cloud, along with hands-on guidance to innovative hardware acceleration technologies, advanced development tools, and programming frameworks, enabling them to optimize and accelerate their applications. Working within the OpenPOWER ecosystem, participants were challenged to build or transform their passion projects, finding new ways to accelerate applications in need of a performance boost.  They became contenders as individuals or teams in three courses: The Open Road Test, The Accelerated Spark Rally, and The Cognitive Cup.

And now, after months of forward thinking, collaboration and propelling innovative technologies ever forward, I am excited to announce the four winners of the inaugural OpenPOWER Developer Challenge!


  • Scaling Up And Out A Bioinformatics Algorithm, by Zaid Al-Ars, Johan Peltenburg, Matthijs Brobbel, Shanshan Ren, and Tudor Voicu from Delft University of Technology – Winner of the The Open Road Test
  • EmergencyPredictionOnSpark, by Antonio Carlos Furtado from the University of Alberta – Winner of The Accelerated Spark Rally
  • artNET Genre Classifier, by Praveen Sridhar and Pranav Sridhar – A two-way tie in The Cognitive Cup
  • DistributedTensorFlow4CancerDetection, by Alexander Kalinovsky, Andrey Yurkevich, Ksenia Ramaniuk, and Pavel Orda from Altoros Labs – A two-way tie in The Cognitive Cup

We spoke with each of the winners ahead of the OpenPOWER Summit Europe for insight on their experience and key takeaways from our inaugural Challenge. Here’s what our winning developers learned and what inspired their innovative applications.

Scaling Up And Out A Bioinformatics Algorithm

In addition to further developing an application that advances precision medicine, the engineers at Delft University of Technology acquired valuable skills both on a technical and team building level. As the team continues to work to further build the application, they are optimistic that working with the OpenPOWER Foundation will create a valuable network of partners to further collaborate and grow.

What was the inspiration for your application?

For a couple of years now, our group at TU Delft has been actively working to address the computational challenges in DNA analysis pipelines resulting from Next Generation Sequencing (NGS) technology, which brings great opportunities for new discoveries in disease diagnosis and personalized medicine. However, due to the large size of the datasets used, it would take a prohibitively long time to perform NGS data analysis. Our solution combines scaling on high-performance computer clusters with hardware acceleration of genetic analysis tools to achieve an efficient solution. – Zaid Al-Ars, Assistant Professor at the Delft University of Technology and co-Founder of Bluebee


EmergencyPredictionOnSpark

Antonio Carlos Furtado developed an emergency call prediction application through the OpenPOWER Developer Challenge, bringing himself up to speed with the OpenPOWER environment for the first time and then trying out different approaches to implementing his big data analytics application. He is interested in exploring new features in deep learning and excited to get a glimpse of what is new in high-performance computing at SC16.

What did you learn from the Challenge?

I learned more from the OpenPOWER Developer Challenge than what I usually learn after taking a four-month course at the university. The most useful thing I learned was probably the functional programming paradigm. As with most programmers, I am more familiar with the imperative programming paradigm. At some point during the Challenge, I realized that I would have to get myself familiarized with Scala programming language and functional programming to get my project completed in time. The main goal of the project was to use Apache Spark to scale the training of a deep learning model across multiple nodes. When learning about Apache Spark, I found that not only are there more resources for Scala, but it is also the best way to use it. I enjoyed programming in Scala so much that I continue learning it and using it even today. – Antonio Carlos Furtado, MSc Student at University of Alberta and Developer at wrnch

artNET Genre Classifier

Developers Praveen Sridhar and Pranav Sridhar were intrigued by the differentiated compute facilities provided to applicants. Initially, joining the Challenge was about testing the technologies provided on their art genre classifier; however, it transformed into absorbing and understanding deep learning through participation, which is imperative for long-term application development.

Why did you decide to participate in the OpenPOWER Developer Challenge?

I was fascinated by the fact that such awesome compute facilities were being provided by OpenPOWER for Developer Challenge participants. I initially just wanted to try out what was being provided, but once I realized its power, there was no stopping. I practically learned Deep Learning by participating in this Challenge. – Praveen Sridhar, Freelance Developer and Data Analyst


DistributedTensorFlow4CancerDetection

Altoros Labs found that combining a rapidly developing challenge space, automated cancer detection using TensorFlow, with the robust and proven platform offered through the OpenPOWER Developer Challenge led to amazing results. The developers expect the beta version of the application to launch in a few months, and Altoros Labs will continue to draw on the OpenPOWER community to strengthen the application.

Why did you decide to participate in the OpenPOWER Developer Challenge?

Exploring TensorFlow has been one of our R&D focuses recently. We also knew that POWER8 technology is good at enhancing big data computing systems. Our team liked the idea of bringing the two solutions together, and the Challenge was a great opportunity to do so. Even though it was the first time we tried to participate in this kind of challenge, we got promising results and are going to continue with experiments. – Ksenia Ramaniuk, Head of Java & PHP Department at Altoros

Putting advanced tools at the fingertips of some of the most innovative minds is powering the growing open technology ecosystem, and the OpenPOWER Foundation is pleased to be a part of that progression. We’ll continue to place great importance on encouraging developer-focused collaborations and innovations capable of impacting the industry.

Help Build the Next Great OpenPOWER Application

Join the Grand Prize winners with IBM and OpenPOWER at SC16 in Salt Lake City, November 15-19. Hear first-hand their experiences and see full demos of their winning applications at the IBM booth.

Are you ready to get started on your OpenPOWER application? Check out our new Linux Developer Portal. Think your application idea is good enough to win the OpenPOWER Developer Challenge? Then be sure to follow us to get updates on next year’s Challenge!


New Physical Science Work Group Addresses Physics, Chemistry, and more with OpenPOWER

By Andrea Bulgarelli, Chair, OpenPOWER Physical Science Work Group

As the application of OpenPOWER technology expands, so too must the OpenPOWER Foundation continue to explore workloads demanded by the market that best leverage our technology. In pursuit of that, the OpenPOWER Foundation is pleased to announce the formation of the new Physical Science Work Group.

The Physical Science Work Group is a persistent work group focused on establishing an interface between OpenPOWER Foundation members and the Physical Science community. The group aims to address the challenges of Physical Science projects by developing use cases, identifying requirements, and extracting workflows.

Applying OpenPOWER to Physical Sciences

We created this work group to understand how the OpenPOWER ecosystem can help physical science projects. Today the scientific community, from Big Science projects down to single laboratories, is facing an enormous increase in data volume, rate, and dimensionality from experiments and computational science.

The work group will address two main projects:

(1) Use cases, requirements, common workflows, and reference solutions for current and future Physical Science projects. Based on the requirements, the group will identify common workflows and possible reference solutions in collaboration with other OpenPOWER Foundation Work Groups.

(2) Scientific software frameworks and libraries. The group will identify widely used software frameworks and libraries in the Physical Sciences and track the status of their porting to OpenPOWER solutions.

Another important goal is to focus hardware/software developers on physical science project requirements that are not covered by current solutions.

An Open Approach

By working around use cases, the work group allows the OpenPOWER Foundation to be a forum between scientists and developers of technical solutions, and also between scientists of different fields and projects, who can share experience and solutions with each other. Participation in this work group is also open to non-OpenPOWER members, to help open the discussion around OpenPOWER technology within the Physical Science community. For the same reason, contributions and feedback are not subject to any requirement of confidentiality, and the deliverables and their reviews are public. This will help collect feedback from all interested people, not only OpenPOWER Foundation members.

Learn more at OpenPOWER Summit Europe

This Friday, at the OpenPOWER Foundation Summit Europe, I will explain what the Physical Science Work Group is, why it was important for the Foundation to start it, and some of the workloads and problems the work group will address. To learn more about the new work group and others exploring the potential and use of OpenPOWER technology, please visit the OpenPOWER Foundation website.

Making Unforgettable MRAM Memory with OpenPOWER

By Adam McPadden, Lead Engineer, Burlington Systems Lab, IBM

One of the key tenets of the OpenPOWER Foundation’s collaborative model is that having open systems and published interfaces allows people to create innovative architectures at all different areas of the system, including ones where there hasn’t been much change in decades like memory.

In validation of this approach, OpenPOWER members IBM and Everspin have demonstrated a new way for OpenPOWER members to improve application performance with STT-MRAM on the memory bus of a POWER8 server.

STT-MRAM falls within a broad memory classification commonly referred to as Storage Class Memory (SCM), whose performance attributes lie between those of traditional DRAM main memory and FLASH storage, while offering the benefit of non-volatility: retaining data without power. Typically, applications cannot process data until it is loaded into memory from storage, causing a performance bottleneck.  With SCM this is not necessary: the data always stays in memory, resulting in much faster application performance. Various types of SCM offer benefits over traditional memory.  STT-MRAM offers non-volatility at DRAM-like speeds with endurance 10^6 times better than NAND FLASH, while PCM and ReRAM offer higher capacity than DRAM and faster speeds than FLASH.
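To make that hierarchy concrete, the sketch below ranks the technologies by rough, order-of-magnitude access latency; the figures are illustrative assumptions, not vendor specifications:

```python
# Illustrative (order-of-magnitude) access latencies for the memory/storage
# hierarchy described above; the exact figures are assumptions.
latency_ns = {
    "DRAM": 100,            # traditional volatile main memory
    "STT-MRAM": 100,        # DRAM-like speed, but non-volatile
    "PCM/ReRAM": 1_000,     # slower than DRAM, denser, faster than flash
    "NAND flash": 100_000,  # block storage behind a controller
}

for tech, ns in sorted(latency_ns.items(), key=lambda kv: kv[1]):
    print(f"{tech:>10}: ~{ns:,} ns")
```

The gap between the middle rows and the bottom row is exactly where SCM earns its name: storage-like persistence at something much closer to memory-like latency.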

SCM technologies, such as PCM, ReRAM and STT-MRAM, have been around for many years with the promise of faster system performance achievable by putting non-volatility on the memory bus. Unfortunately, due to scaling challenges and complex materials, scalable, production-volume SCM has been slow to develop.

IBM, which has long recognized the performance potential of systems with SCM, has dedicated teams of engineers and scientists from IBM Research and the Systems Development Lab over the past two years to enabling these new memory technologies in the POWER system architecture. This opens a new opportunity for the OpenPOWER community to innovate with production-level SCM technology as a viable media, leveraging attach points such as CAPI, OpenCAPI and NVMe. SCM technologies now give OpenPOWER Foundation members the ability to combine high performance media with low latency, high bandwidth interfaces on the POWER architecture to achieve performance benefits beyond traditional FLASH.

“New advanced memory technologies will have a disruptive impact on the industry.  This demonstration of MRAM in a POWER8 server running real applications is a great example of what OpenPOWER is all about – creating opportunities for industry partners to innovate and enabling choice in the market,” explains Steve Fields, IBM Fellow and Chief Engineer of POWER systems.

Figure 1: IBM’s Con Tutto Platform

Driving Memory Performance with Con Tutto

Enabling new memory technologies required IBM and its partners to develop a prototyping platform which would allow non-DRAM technologies to run at full bus speeds in their POWER8 server. This platform, named Con Tutto, combines FPGA flexibility with at-speed memory bus compatibility. The Con Tutto card allows POWER8 users to develop the software stack necessary for persistent memory support and better understand the system level characteristics associated with various SCM technologies today.

Figure 2: Storage Class Memory Latency

High performance technologies such as STT-MRAM on the system memory bus offer a low latency attach point for applications to leverage persistent memory with direct access (DAX) from the application.  The performance value of SCM in a server depends heavily on the technology and implementation specifics.  Leveraging the Con Tutto card with STT-MRAM, in-system test results show up to 97% lower latency and 20X higher bandwidth when compared to a current generation FLASH NVMe card, and we are working to make this even faster.
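The DAX access model mentioned above can be sketched in a few lines. The example below memory-maps an ordinary temp file as a stand-in for persistent memory (real DAX would map a pmem-aware file or device, and real pmem code would use CPU cache-flush instructions rather than an msync-style flush); the point is that once mapped, the application persists data with plain loads and stores rather than read/write calls per access:

```python
import mmap
import os

# Stand-in for a DAX-mapped persistent memory region. Real DAX maps a pmem
# device or pmem-aware file; an ordinary file lets the sketch run anywhere.
PATH = "pmem_demo.bin"

def write_journal_entry(entry: bytes) -> None:
    with open(PATH, "wb") as f:
        f.truncate(4096)                       # size the "pmem" region
    with open(PATH, "r+b") as f:
        with mmap.mmap(f.fileno(), 4096) as pmem:
            pmem[:len(entry)] = entry          # plain store into the mapping
            pmem.flush()                       # persistence barrier (msync)

def read_journal_entry(n: int) -> bytes:
    with open(PATH, "r+b") as f:
        with mmap.mmap(f.fileno(), 4096) as pmem:
            return bytes(pmem[:n])             # plain load from the mapping

write_journal_entry(b"journal entry 1")
print(read_journal_entry(15))
os.remove(PATH)
```

Write caching and journaling workloads, mentioned later in this post, follow exactly this store-then-barrier pattern.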

Accelerating Applications with Unforgettable Memory

IBM has partnered with Everspin Technologies to demonstrate their first production level pMTJ (Perpendicular Magnetic Tunnel Junction) STT-MRAM chips in a high performance S824L server seen in Figure 3, leveraging the lower power, higher performance offered by this architecture.

Figure 3: IBM S824L Server running STT-MRAM on the Memory Bus

While this STT-MRAM solution is in production, its capacity to date has limited broad usage to applications that need non-volatility and high performance but not high capacity (write caching, journaling, etc.). The announcement of a 1Gb chip by Everspin will improve viability for broader use cases. SCM technologies such as ReRAM, PCM and others will expand the application value proposition of persistent memory as they mature.

Learn More at OpenPOWER Summit Europe in Barcelona

IBM and Everspin will be showcasing this new solution in an application demo at the OpenPOWER Summit Europe, building on a previous demo shown at the 2016 OpenPOWER Summit in San Jose, CA, where IBM engineers and scientists were the first to demonstrate production level STT-MRAM on the memory bus of a POWER8 server using IBM’s DMI (Differential Memory Interface) bus.  In the demo, you’ll see the performance benefits of combining a high performance SCM and a low latency bus on key business applications.

You can also learn more about Con Tutto by visiting these links on the OpenPOWER Foundation:

The Revolution Comes to Europe!

By Amanda Quartly, OpenPOWER Alliances Europe, IBM


The only constant in being involved with the OpenPOWER Foundation is change and innovation, and there is plenty happening! For instance, OpenPOWER members IBM and NVIDIA just launched a new set of servers built for the cognitive and AI-driven age. Now it’s time for the focus to turn to Europe with the upcoming OpenPOWER Summit Europe, running 26–28 October.

Our membership and activities in Europe have continued to grow along with our efforts all over the world! This is your chance to catch up and hear the latest announcements from our members on how they’re driving the OpenPOWER ecosystem forward.

Register for free today to join us for:

    • 20 keynotes from STFC Hartree Centre, AT&T, OpenStack, Kolab, NVIDIA, Mellanox, Kinetica, E4 and more to be announced.
    • 22 breakout sessions featuring OPF members and OpenStack on OpenPOWER demonstrations.
    • Numerous workgroup, Birds of a Feather and Panel sessions.
    • The Rebel Alliance Reception on Thursday night to network with other OpenPOWER revolutionaries!

We will highlight OpenPOWER adoption stories, new European members and new innovations based on OpenPOWER systems. Plus hear from developers and ISVs on what they’re doing, and be there for the announcement of the winners of the inaugural OpenPOWER Developer Challenge. And to top it all off, attendance is free!

For more details visit the official OpenPOWER Summit Europe page here:


Register for free on our Eventbrite:


For sponsorship opportunities, fill out the Sponsor Application:


The OpenPOWER Foundation is pleased to be working with the OpenStack Summit. As we market these events together, we recommend that you purchase a Full Pass or the Keynote and Marketplace Pass to be able to attend the OpenStack Summit. You can purchase an OpenStack pass on their webpage:

OpenPOWER Developer Challenge Finalists Announced

By Calista Redmond, President, OpenPOWER Foundation

Recently at IBM Edge I had the pleasure of announcing the Finalists of the 2016 OpenPOWER Developer Challenge.   From the kick-off of this global challenge in the Spring – a first-ever for the OpenPOWER Foundation – to the many MeetUps and Google Hangouts where we met Developers and Challenge participants around the world, it’s been a fantastic journey, and it’s not over yet.

Hundreds of Developer Challenge participants worked throughout the summer using the SuperVessel Developer Cloud to port, optimize, accelerate and scale HPC, Big Data & Analytics and Deep Learning applications on OpenPOWER. They had access to hardware acceleration technologies including NVIDIA GPUs and FPGAs from Xilinx, advanced development tools including the IBM XL Compilers and the Linux on Power SDK, and programming frameworks like Apache Spark and the OpenPOWER Deep Learning Software Distribution.

The six projects qualifying as Finalists are:

All six of these projects will be awarded either a Grand, 2nd or 3rd Prize – stay tuned for the Grand Prize and rankings announcement during the upcoming OpenPOWER Foundation Summit in Barcelona.  Finally, plan to join the Grand Prize winners with IBM and OpenPOWER at SC16 in Salt Lake City.

In 2016 Developers became the stars of the OpenPOWER Foundation, and this is just the beginning! Want to learn more about developing on Power? Visit the new Linux on Power Developer Portal.

Recap: CDAC Three Day Workshop on OpenPOWER for HPC and Big Data Analytics

By Dr. VCV Rao, Centre for Development of Advanced Computing


Recently, the Centre for Development of Advanced Computing (CDAC) in India held a three-day workshop where presenters from various industries examined the progress and opportunity to leverage OpenPOWER technology. The objective of the workshop was to understand the performance and scalability of high performance computing (HPC) application kernels, Big Data processing, and data science applications on RISC-based IBM POWER8 systems with GPUs as a part of the OpenPOWER Foundation.

Representatives from IBM, Mellanox and CDAC discussed the POWER8 architecture, application performance compared to x86 systems, and how easily applications running on x86 can be ported to POWER8. They also discussed the Power architecture’s roadmap, looking ahead to updates and enhancements. Finally, we discussed accelerator technologies like GPUs and FPGAs. With technologies like NVIDIA NVLink and CAPI, the Foundation is very well positioned to harness the power of acceleration.

Over the three-day workshop, we learned a lot about high performance computing, in particular how to use NVIDIA GPUs in parallel programming to improve the performance of HPC applications. We also discussed how to achieve greater bandwidth using Mellanox interconnects, and how to expand our capabilities in FPGA programming.

A lot of time was spent discussing how Big Data applications can scale with POWER8 and GPUs. To answer that question, the workshop introduced a range of compiler toolkits and libraries, such as CUDA for GPUs, along with hands-on writing and testing of parallel code with MPI and OpenMP.

CDAC is dedicated to advancing Supercomputing research and workshops like these help us to bring together discussion around many important topics. To learn more about CDAC and our work in the OpenPOWER Foundation, join us at future workshops by registering on our Events Page.  We look forward to the next workshop. Let us know what you would like to see on the agenda in the comments.


OpenPOWER Members Bring Rackspace-Led Open Compute Barreleye Server to Market

By Sam Ponedal, Social Strategist, OpenPOWER Foundation


Back in March, we told you about how OpenPOWER members StackVelocity, Mark III Systems, and Penguin Computing had adopted Rackspace’s Barreleye server design. Today, we are pleased to relay the news from IBM Edge that our members have reached the next milestone in their journey and have released their Barreleye server designs. Let’s take a look at all the Barreleye news from our members:


Originally published on by Aaron Sullivan

Barreleye is available from multiple outlets, to suit many kinds of consumers. From solution providers who specialize in hyperscale, to high performance computing, to the IBM business partner network, you can purchase Barreleye from a company that understands your business.

Barreleye works with a variety of Linux distributions and KVM hypervisors. It has chassis options for those who like to keep their storage high capacity, in-box and powerful, or light-weight and low-cost. It is configurable for basic low-cost networking, or very high-throughput networking. And for a server with such a low mechanical profile, it has great PCI adapter capacity.

If you want to test drive it, but don’t have an Open Rack handy, there’s a simple-to-use benchtop power supply (called Lunchbox) we developed along with Barreleye. Here are a few other things that Barreleye does:

  • Leverages one of the most powerful 2-socket servers on the planet.
  • Gets your organization closer to the cutting edge of open hardware development.
  • Makes a clear statement to your suppliers: you expect more freedom, value and influence.

Mark III Systems

Originally published on by Andy Lin

Today at IBM Edge 2016, Mark III and our partners in the OpenPOWER Foundation are announcing the immediate availability of an OpenPOWER server platform based on the Barreleye Open Compute Project (OCP) design.

We’re making this announcement specifically in partnership with Penguin Computing under the OCP-compatible model of the Penguin Magna 1015, which provides an enterprise-supported version of the Barreleye system.  As a long-time IBM Premier Business Partner with two decades of experience with POWER, our strong team of engineers is also available to offer expertise and services around the Magna 1015 platform to ensure that our joint OpenPOWER clients are successful.



As you may recall, Barreleye is based on the Rackspace-led OCP design that incorporates OpenPOWER technologies (including POWER8 processors), and is the system that Mark III announced back in March at the OCP Summit that it would be offering very soon.

We view the Magna 1015 (Barreleye) as fitting a key niche in our portfolio of OpenPOWER platforms, as many hyperscale users of compute have looked at or are starting to look at OCP approaches to maximizing datacenter efficiency as they grow.

As a member of both foundations, we’re very excited about the future of both OpenPOWER and OCP in delivering highly efficient architectures for the bandwidth-intensive workloads of the next decade.  To us, Barreleye is the culmination of both these industry movements, but is also just the beginning of a new wave of innovation.

Penguin Computing

Originally published on 

Penguin Computing, a provider of high performance computing, enterprise data center and cloud solutions, today announced immediate availability of Penguin Magna 1015, an OpenPOWER based system for cloud and hyperscale data center environments.

Based on the “Barreleye” platform design pioneered by Rackspace and promoted by the OpenPOWER Foundation and the Open Compute Project (OCP) Foundation, Penguin Magna 1015 targets memory and I/O intensive workloads, including high density virtualization and data analytics. The Magna 1015 system uses the Open Rack physical infrastructure defined by the OCP Foundation and adopted by the largest hyperscale data centers, providing operational cost savings from the shared power infrastructure and improved serviceability.

“Penguin is all about open technologies and offering choice of platforms for the customer application”, said Jussi Kukkonen, Director, Product Management, Penguin Computing. “Penguin’s partnership with Mark III provides our customers with a unique combination of comprehensive OCP server, storage and networking catalog together with OpenPOWER architecture and applications expertise.”

“As a fellow member of the OpenPOWER Foundation, Mark III is excited to be working with Penguin Computing on OCP solutions enabled with OpenPOWER technologies,” said Andy Lin, Vice President of Strategy, Mark III Systems. “We believe that an OCP compatible system powered by OpenPOWER processors presents a truly unique value proposition for hyperscale users of compute looking for a differentiated platform to efficiently run and scale high-bandwidth workloads, including big data analytics, HPC, and cloud.”


Originally published on by Doug Taylor

The OpenPOWER Foundation (OPF) stands to become a significant complement to OCP. The IBM POWER architecture, which is well known in the industry as the performance leader, has moved to an open licensing model. Through the OPF, an ecosystem of chip companies, board manufacturers, networking vendors, and others is driving innovation to create the next generation of Web 2.0 compute platforms that are open.

As testament to how well the OPF and OCP foundations work together, Doug Balog, General Manager for POWER Systems at IBM, announced today at the IBM Edge event that Barreleye is ready for mass production and available for purchase. Barreleye is a powerful and highly efficient server built with OpenPOWER™ technologies and delivered through the Open Compute Foundation. StackVelocity is excited to be collaborating with the open community by bringing Barreleye to market.

StackVelocity is able to complement the performance of OpenPOWER with our very own high-density OCP storage platform called HatTrick Storage. The HatTrick Storage platform delivers up to 15 LFF drives in the same form factor as a “Winterfell/Leopard” server, allowing up to 45 LFF drives in 2 OU—that’s a 50% increase in density over the currently available solutions. It provides substantial capacity in an extremely efficient footprint and can be configured to match any workload.

For those customers that also need a standard EIA 19” OpenPOWER solution, we have a high-performance platform called Saba that features OpenPOWER POWER8™ processors to tackle the challenge of extracting value from massive amounts of information. Saba can support up to 1TB of memory and 24 SFF drives. This means massive amounts of information can be brought to compute resources in real time, maximizing business insight.

These building blocks provide the core from which we can help our customers tailor OpenPOWER solutions that fit their unique business needs.


OpenPOWER Helps India Advance National Supercomputing Mission with new Research Facility at IIT Bombay

By Professor P.S.V. Nataraj, Systems and Control Engineering Group, IIT Bombay


During my visit to IBM, Bangalore in April 2014, the idea for having a collaboration between the OpenPOWER Foundation and IIT Bombay (IITB) was born. The OpenPOWER Foundation’s representative, Ganesan Narayanasamy, presented the genesis, objectives, and activities of the Foundation to Prof. Nataraj, and from this conversation IIT Bombay joined the Foundation as an academic member. Continue reading

OpenPOWER Host OS Repository Launches on GitHub!

By Ricardo Marin Matinata, Linux Architect, KVM and Cloud on POWER, IBM

The initial version 0.5 (beta) of the OpenPOWER HostOS repository is available!

As new OpenPOWER hardware features and servers are developed by multiple partners, it becomes a challenge to deploy them in an OS environment that leverages a known and tried base and, at the same time, allows for the flexibility that is required to support the diversity of requirements. To address this challenge, IBM is launching a new collaboration model: an open community for OpenPOWER hardware enablement and features that is built on top of a reference Host OS/KVM for the Power architecture.

Through this community, IBM and OpenPOWER are providing an open source repository that is seeded with the core elements, allowing OpenPOWER partners to build and validate their own deliverables. This repository includes the core kernel as well as other key components to enable KVM virtualization, along with build scripts and a validation suite.  These components enable members of the OpenPOWER ecosystem to build their own Host OS with the optional support of KVM on Power and, most importantly, allow them to contribute back to this community. The repository also provides an additional usage model: an abstraction layer based on KVM virtualization. This option allows OpenPOWER partners to deploy new hardware features and servers while maintaining a stable environment for guest operating systems.

While IBM remains committed to each respective upstream community, this new community will help all to advance the OpenPOWER ecosystem and ensure some feature consistency. Stay tuned for version 1.0, which will bring additional stability and more Linux enablement for OpenPOWER innovations, such as new processor features, as well as advancements on virtualization technology.

To get started, more information is available at the OpenPower HostOS Github portal:

The full collection of components can be found here:

Why the OpenPOWER Developer Challenge is Important to Kinetica

By Amit Vij, CEO, Kinetica


At Kinetica (formerly GPUdb), we have experienced first-hand how the massive hardware acceleration improvements made possible by OpenPOWER can have truly transformational benefits for enterprises. We are in the business of helping customers uncover new business insights in real-time from massively growing volumes of data, often spanning IoT and other streaming data sources. We simply couldn’t solve our customer’s problems with traditional data technologies. OpenPOWER not only makes it possible to deliver massive data processing performance gains at a fraction of the cost, it also allows our customers to tackle brand new challenges and to make the world a better place for us all. Continue reading

Develop Exciting Cognitive Applications in the OpenPOWER Developer Challenge

By Mike Gschwind, Chief Engineer, Machine Learning and Deep Learning, IBM

Cognitive Applications have transformed the face of computing and how humans interact with computers. Some examples are driver-assistive technologies for enhanced road safety, personalized assistants like Siri and Google Now for improved productivity, and enhanced public security through advanced threat detection. Reflecting the increasing importance of cognitive applications, when we launched the OpenPOWER Developer Challenge earlier this month we included a competition around developing cognitive applications: the Cognitive Cup! Continue reading

With OpenPOWER, Unicamp Shares Academic Research Across Brazil

By Juliana Rodrigues, Student, Unicamp

(This post appears as part of our Developer Series. To learn more about what developers are doing with OpenPOWER, visit the OpenPOWER Developer Challenge)

I was about four years old when I got my first computer, but it wasn’t until I was 13 that I had my first experience with Linux. I didn’t have a CD drive at the time, so I did a Debian Etch net-install on a dial-up connection. It only took about four hours until I lost my connection and had to start over. After a few more hours and a lot of work, when I saw the login screen I felt like I was diving into a new world. Continue reading

E4 Computer Engineering Showcases Full Line of OpenPOWER Hardware at International Supercomputing

By Ludovica Delpiano, E4 Computing

E4’s mission, to drive innovation by implementing and integrating cutting-edge solutions with the best performance for every high-end computing and storage requirement, is very much our focus for this year’s edition of ISC. We chose to showcase a number of systems at our booth, #914, based on one of the most advanced technologies available at the moment: accelerated POWER8 technology. Continue reading

New Whitepaper: HPC and HPDA for the Cognitive Journey with OpenPOWER

By Dr. Srini Chari, Managing Partner, Cabot Partners

I’m pleased to announce the publication of Cabot Partners’ new Whitepaper, HPC and HPDA for the Cognitive Journey with OpenPOWER.  An update to last year’s Crossing the Performance CHASM with OpenPOWER, our latest analysis captures the progress and continuing momentum of the OpenPOWER Foundation and the evolution of its accelerated computing roadmap. Continue reading

Diversify Cloud Computing Services on OpenPOWER with NEC’s Resource Disaggregated Platform for POWER8 and GPUs

By Takashi Yoshikawa and Shinji Abe, NEC Corporation

The Resource Disaggregated (RD) Platform expands the use of cloud data centers in not only office applications, but also high performance computing (HPC) with the ability to simultaneously handle multiple demands for data storage, networks, and numerical/graphics processes. The RD platform performs computation by allocating devices from a resource pool at the device level to scale up individual performance and functionality. Continue reading

eASIC Brings Advanced FPGA Technology to OpenPOWER

By Anil Godbole, Senior Marketing Manager, eASIC Corp. 

eASIC is very excited to join the OpenPOWER Foundation. One of the biggest value propositions of the eASIC Platform is to offer an FPGA design flow combined with ASIC-like performance and up to 80% lower power consumption. This allows the community to enable custom designed co-processor and accelerator solutions in datacenter applications such as searching, pattern-matching, signal and image processing, data analytics, video/image recognition, etc.

Need for Power-efficient CPU Accelerators

The advent of multi-core CPUs and GPUs has helped increase the performance of modern datacenters. However, this performance is being limited by a disproportionate increase in energy consumption. As workloads like Big Data analytics and Deep Neural Networks continue to grow in size, there is a need for a new computing paradigm that continues scaling compute performance while keeping power consumption low.

A key technique is to exploit parallelism during program execution. While multi-core processors can also execute in parallel, they burn a lot of energy when sharing data and messages between processors. That is because such data typically resides in off-chip RAM, and accesses to it are very power-hungry.
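A back-of-envelope calculation makes the point. The per-access energy figures below are assumed, order-of-magnitude values (not eASIC specifications); on-chip SRAM accesses are commonly cited as costing far less energy than off-chip DRAM accesses:

```python
# Assumed, order-of-magnitude energy per 64-bit access -- illustrative only.
ENERGY_PJ = {"on_chip_sram": 5, "off_chip_dram": 640}

def data_sharing_energy_uj(accesses: int, fraction_off_chip: float) -> float:
    """Total energy in microjoules for a given mix of on/off-chip accesses."""
    off = accesses * fraction_off_chip * ENERGY_PJ["off_chip_dram"]
    on = accesses * (1 - fraction_off_chip) * ENERGY_PJ["on_chip_sram"]
    return (on + off) / 1e6  # pJ -> uJ

# One million shared-data accesses: a multicore sharing through off-chip RAM
# versus a fabric with local memories keeping most traffic on-chip.
mostly_off_chip = data_sharing_energy_uj(1_000_000, 0.9)
mostly_local = data_sharing_energy_uj(1_000_000, 0.1)
print(f"mostly off-chip: {mostly_off_chip:.0f} uJ, mostly local: {mostly_local:.0f} uJ")
```

Whatever the exact figures, shifting the bulk of shared-data traffic into local memories cuts the energy bill by close to the ratio of the two access costs.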

eASIC Platform

The eASIC Platform uses distributed logic blocks with associated local memories, which enable highly parallel and power-efficient implementations of the most complex algorithms. With up to twice the performance of FPGAs and up to 80% lower power consumption, the eASIC Platform can provide highly efficient performance per watt for the most demanding algorithms.  The vast amount of storage provided by the local memories allows fast message and data transfers between the compute elements, reducing latency without incurring the power penalty of accessing off-chip RAM.

CAPI Enhancements

CAPI defines a communication protocol for command and data transfers between the main processor and the accelerator device based on shared, coherent memory. Compared to traditional I/O-based protocols, CAPI’s approach precludes the need for OS calls, thereby significantly reducing the latency of program execution.
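As a rough illustration of why avoiding a kernel call per transfer matters (this is a generic microbenchmark, not CAPI itself), compare the per-operation cost of a syscall-based transfer against a plain access to already-mapped shared memory:

```python
import mmap
import os
import time

# Crude illustration: per-operation cost of crossing into the kernel for each
# transfer versus touching shared, mapped memory directly. Absolute numbers
# vary by machine; only the ratio matters.
N = 100_000

fd = os.open("/dev/zero", os.O_RDONLY)
buf = mmap.mmap(-1, 4096)  # anonymous mapping, stands in for coherent shared memory

t0 = time.perf_counter()
for _ in range(N):
    os.pread(fd, 8, 0)          # one syscall per transfer (I/O-style path)
syscall_ns = (time.perf_counter() - t0) / N * 1e9

t0 = time.perf_counter()
for i in range(N):
    buf[i % 4096]               # one memory access per transfer (shared-memory path)
load_ns = (time.perf_counter() - t0) / N * 1e9

os.close(fd)
print(f"syscall path ~{syscall_ns:.0f} ns/op, memory path ~{load_ns:.0f} ns/op")
```

On typical systems the kernel-crossing path costs several times more per operation, which is the overhead a coherent shared-memory protocol like CAPI is designed to eliminate.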

Combining the benefits of eASIC Platform and CAPI protocol can lead to high performance and power-efficient Co-processor/Accelerator solutions. For more details on the eASIC Platform please feel free to contact us or follow us on Twitter @eASIC.

Managing Reconfigurable FPGA Acceleration in a POWER8-based Cloud with FAbRIC

By Xiaoyu Ma, PhD Candidate, University of Texas at Austin

This post is the first in a series profiling the work developers are doing on the OpenPOWER platform. We will be posting more from OpenPOWER developers as we continue our OpenPOWER Developer Challenge


FPGAs (Field-Programmable Gate Arrays) are becoming prevalent. Top hardware and software vendors have started making it standard to incorporate FPGAs into their compute platforms for performance and power benefits. IBM POWER8 delivers CAPI (Coherent Accelerator Processor Interface) to enable FPGA devices to be coherently attached to the PCIe bus. Industries from banking and finance to retail, healthcare and many other fields are exploring the benefits of FPGA-based acceleration on the OpenPOWER platform. Continue reading

OpenPOWER Ready™ Solutions Expand Growing OpenPOWER Ecosystem

By Jeff Brown, ‎Distinguished Engineer, Emerging Product Development at IBM


Continuing the OpenPOWER Foundation’s momentum, we’ve launched the OpenPOWER Ready™ program at the OpenPOWER Summit this week in San Jose. This program empowers both members and non-members to embrace and promote their OpenPOWER technology. This designation will strengthen our ecosystem of products and solutions built upon IBM’s POWER architecture, creating additional confidence for developers, builders and customers that use OpenPOWER Ready hardware and software.

OpenPOWER Ready was designed to indicate that a product or solution has met a minimum set of criteria set forth by the Foundation. The OpenPOWER Ready definition and criteria were developed collaboratively by several of the Foundation’s work groups and will evolve over time under the direction of a new OpenPOWER Ready work group. Part of these criteria centers on whether a product or solution is interoperable with other OpenPOWER Ready products, reinforcing the collaborative nature of the OpenPOWER ecosystem. Both OpenPOWER members and non-members can apply for the mark, which can be designated for both qualifying hardware and software. We’ve outlined the full set of OpenPOWER Ready criteria on the OpenPOWER Foundation website.

We are excited to continue to transform the data center with the OpenPOWER Ready journey. In addition to increasing confidence in existing members’ OpenPOWER-based products, we hope to inspire non-members with OpenPOWER Ready innovations to join the OpenPOWER Foundation and further grow our collaborative, open ecosystem. It is our vision that companies and other entities utilizing this mark will further solidify OpenPOWER technology as a superior alternative to other server solutions.

To see the first set of products designated OpenPOWER Ready, visit the OpenPOWER Ready homepage.

OpenPOWER Foundation Revolutionizes the Data Center at Summit 2016!

By John Zannos, Chairman, OpenPOWER Foundation


As we reach the pinnacle of our second OpenPOWER Summit in San Jose, I want to take a minute to recognize all of our members who have contributed to the momentum and growth we’ve seen since we gathered here last year for our inaugural event.

I also want to thank the members of the OpenPOWER Foundation Board for electing me as the new Chairman and electing Calista Redmond of IBM as President. Calista and I follow the success of the Foundation’s former Chair and founding OpenPOWER member, Gordon MacKean of Google, and former President and founding OpenPOWER member, Brad McCredie of IBM. My thanks to Gordon and Brad.

I’m happy to say that our membership has grown and surpassed the two hundred mark. It’s not just our membership that’s expanding, though – it’s the entire OpenPOWER ecosystem. We’re seeing more hardware and software innovations being developed and launched into the market, OpenPOWER work groups are building the guidelines that drive innovation, and there’s a growing number of developers working on OpenPOWER.

It’s clear to see that companies around the world are interested in collaborating to create innovative products and solutions that meet the needs of the modern data center. The market continues to ask for technology choice and openness. OpenPOWER is supplying collaborative innovation by pulling together a community that is working and innovating together.

Today at Summit, our members announced more than 50 new OpenPOWER-based innovations, many of which were developed in collaboration with fellow Foundation members. The new innovations showcase the Foundation’s commitment to CAPI accelerator technology and building new solutions for high performance computing and cloud deployments. These are real examples of the deep innovation that results from open collaboration. The full list of member solutions is impressive, as you can see by checking out our OpenPOWER fact sheet.

We’re not just focused on developing new solutions. We also remain committed to our OpenPOWER developer ecosystem. Today, we introduced the OpenPOWER Ready™ seal, enabling companies to validate their hardware and software solutions against self-test guidelines from the Foundation. We hope that OpenPOWER Ready will help grow our ecosystem, providing added confidence for developers, builders and customers.

We also announced the first-ever OpenPOWER Developer Challenge to encourage developers to tap the power of open and show us what they can create.

There are several exciting things planned at OpenPOWER Summit over the next few days. You can find the full schedule of OpenPOWER Summit events on our website. If you’re onsite, we invite you to stop by the OpenPOWER Pavilion. And don’t forget about the renowned OpenPOWER Ice Bar tonight from 5-7 p.m. PT – it’s a fan favorite.

Thank you, and please stop by and tell us what you are thinking.

Announcing the OpenPOWER Developer Challenge: Tap the Power of Open

By Randall Ross, Ubuntu Community Manager, Canonical

One thing that unites my work at Canonical as an Ubuntu Community Manager with my work for the OpenPOWER Foundation is both organizations’ clear and unrelenting passion for developers. They both know that developers are the true musicians when it comes to making OpenPOWER “sing”. As OpenPOWER member GPUdb said, “We’re making the instrument but they [developers] are making the song.”

Without developers, hardware is like a high-performance exotic car sitting on a dealer’s lot. We have the technology, but we need someone to drive that car to the Autobahn and “floor it!” (Having a relaxed speed limit helps.)

We know that our OpenPOWER community has plenty of drivers waiting for the opportunity to show what they can do. You may have noticed several developer-focused activities and news items coming from OpenPOWER over the past few weeks. That’s no coincidence. It’s because we’ve been ramping up to share some very exciting news: we are pleased to announce the first ever OpenPOWER Developer Challenge!

Show us what you can do with OpenPOWER technology and you could win a whole range of prizes, from Apple Watches to an all-expenses paid trip to Supercomputing 2016 to showcase your work in front of developers and IT leaders from around the world! Just go to to register.

The OpenPOWER Developer Challenge allows you to participate in two ways:

  • Port and optimize your code in the Open Road Test, and use accelerators to go even faster
  • Join the Spark Rally to train an accelerated deep neural network to recognize objects with greater accuracy, then show us how you can scale with Apache Spark

There is no limit to the number of entries you can submit, so long as each is its own unique application!

The submission period will open on May 1, and closes on August 2, so start forming teams and thinking of project ideas now!

To get started, let’s take a tour of the Supervessel virtual environment that you will be using to build your application.

Go Global with OpenPOWER Developer Resources

By Sam Ponedal, Social Strategist for OpenPOWER, IBM

There are many faces that make up the OpenPOWER Foundation and its ecosystem. We have hardware manufacturers who provide the cutting-edge technology that serves as the hardware platform for OpenPOWER. We have MSPs that install and leverage OpenPOWER technology for their customers, and we have researchers and universities who are applying OpenPOWER technology to solve global problems. But perhaps the most important individuals working in the OpenPOWER ecosystem are developers. Currently, over 1,000 ISVs have built applications for OpenPOWER, and we know that for our ecosystem to grow, we need to keep making OpenPOWER the most accessible open platform for developers, with the performance capabilities to make the most blazing fast applications to boot. With features like CAPI and GPU acceleration, OpenPOWER provides developers with the performance to truly make their applications sing. In addition, OpenPOWER uses the same familiar tools, like Linux and CUDA, and supports both big and little endian modes, so that developers can apply the skills they already have to building new applications.

But how do developers access OpenPOWER hardware for testing and development? To answer that important question, today we are pleased to announce the OpenPOWER Developer Resources Map, available at This interactive and free tool can help developers locate in-person and virtual development resources that best suit their unique needs and goals for their project. Interested in getting hands-on in-person with the POWER8 chip architecture? Visit one of our members’ open developer facilities in your local area. Want to go virtual and explore how CAPI can accelerate your application on the OpenPOWER platform? Leverage Supervessel or our other CAPI-enabled developer clouds. Developers simply log in, select the filters that best suit their project, and then pinpoint the best resource for them.

This new tool complements our existing library of developer tools and information, which features:

  • Development Environments and VMs
  • Development Systems
  • Technical Specifications
  • Software
  • Developer Tools

To learn more about the OpenPOWER tools available to developers, visit our Technical Resources page.

Developers are the key to the growth of the OpenPOWER ecosystem, and with the greatest minds in the world building cutting-edge, high-performance applications on OpenPOWER, the world’s only truly open hardware architecture, we’re excited about the possibilities. If you’re a developer looking to get more involved with OpenPOWER, stay tuned: we’ll be announcing some exciting developer-focused initiatives in the coming months.

Have questions or want to know more about what we offer? Let us know in the comments below! Happy coding!

Singapore’s A*CRC Joins the OpenPOWER Foundation to Accelerate HPC Research

By Ganesan Narayanasamy, Senior Manager, IBM Systems

Singapore’s Agency for Science, Technology and Research (A*STAR) is the largest government funded research organization in Singapore, with over 5,300 personnel in 14 research institutes across the country.

A*STAR Computational Resource Centre

A*STAR Computational Resource Centre (A*CRC) provides high performance computing (HPC) resources to the entire A*STAR research community. A*CRC currently supports the HPC needs of an 800-member user community and manages several high-end computers, including an IBM 822LC system with NVIDIA K80 GPU cards and a Mellanox EDR switch for porting and optimizing HPC applications. It is also responsible for very rapidly growing data storage resources.

A*CRC will work with IBM and the OpenPOWER Foundation to accelerate its development of applications on OpenPOWER systems, leveraging the Foundation’s ecosystem of technology.

Experts at A*CRC will explore the range of scientific applications that leverage the Power architecture as well as NVIDIA GPUs and Mellanox 100 Gb/sec InfiniBand switches. The switches are designed to work with IBM’s Coherent Accelerator Processor Interface (CAPI), an OpenPOWER technology that allows attached accelerators to connect with the Power chip at a deep level.

A*CRC also will work with the OpenPOWER Foundation on evolving programming models such as OpenMP, the open multiprocessing API designed to support multi-platform shared memory.

“We need to anticipate the rise of new high performance computing architectures that bring us closer to exascale and prepare our communities,” A*CRC CEO Marek Michalewicz noted in a statement.


This week, A*STAR is hosting the Singapore Supercomputing Frontiers Conference. To learn more about their work, take part in our OpenPOWER workshop on March 18 and stay tuned for additional updates.

OpenPOWER Members at Open Compute Summit Detail Their Barreleye Plans

By Sam Ponedal, Social Strategist

Last year at Open Compute Summit, OpenPOWER member Rackspace stole the show when they announced their plans to develop Barreleye, their new mega-server built with open standards across the board. In the year since, OpenPOWER members have jumped on the Barreleye bandwagon, and it’s easy to see why when Barreleye was described by Rackspace’s Aaron Sullivan as having “the capacity for phenomenal virtual machine, container, and bare metal compute services.”

India’s Centre for Development of Advanced Computing Joins OpenPOWER to Spread HPC Education

By Dr. VCV Rao and Mr. Sanjay Wandheker 


An open ecosystem relies on collaboration to thrive, and at the Centre for Development of Advanced Computing (C-DAC), we fully embrace that belief.

C-DAC is a pioneer in several advanced areas of IT and electronics, and has always been a proactive supporter of technology innovation. It is currently engaged in several national ICT (Information and Communication Technology) projects of critical value to India and beyond, and C-DAC’s thrust on technology innovation has created an ecosystem in which multiple technologies coexist on a single platform.

Driving National Technology Projects

Within this ecosystem, C-DAC is working on strengthening national technological capabilities in the context of global developments around advanced technologies like high performance computing and grid computing, multilingual computing, software technologies, professional electronics, cybersecurity and cyber forensics, and health informatics.

C-DAC is also focused on technology education and training, and offers several degree programs including our HPC-focused C-DAC Certified HPC Professional Certification Programme (CCHPCP). We also provide advanced computing diploma programs through the Advanced Computing Training Schools (ACTS) located all over India.

One of C-DAC’s critical projects is the “National Supercomputing Mission (NSM): Building Capacity and Capability”, the goal of which is to create a shared environment of advancements in information technology and computing that impact the way people lead their lives.

Partnering with OpenPOWER


The OpenPOWER Foundation makes for an excellent partner in this effort, and through our collaboration, we hope to further strengthen supercomputing access and education by leveraging the OpenPOWER Foundation’s growing ecosystem and technology. And with OpenPOWER, we will develop and refine HPC coursework and study materials to skill the next generation of HPC programmers on OpenPOWER platforms with GPU accelerators.

In addition, C-DAC is eager to explore the potential of OpenPOWER hardware and software in addressing some of our toughest challenges. OpenPOWER offers specific technology features for HPC research, including IBM XLF compilers, ESSL libraries, hierarchical memory with strong per-socket memory bandwidth, I/O bandwidth, CAPI interfaces with performance gains over PCIe, and the potential of POWER8/POWER9 with NVIDIA GPUs. These OpenPOWER innovations will provide an opportunity to understand performance gains for a variety of applications in HPC and Big Data.

Come Join Us

We’re very eager to move forward, focusing on exposure to new HPC tools on OpenPOWER-driven systems. C-DAC plans to be an active member of the OpenPOWER community by making open source HPC software for science and engineering applications available on OpenPOWER systems with GPU acceleration.

To learn more about CDAC and to get involved in our work with OpenPOWER, visit us online at If you would like to learn more about our educational offerings and coursework, go to

New OpenPOWER Member DRC Computing Discusses FPGAs at IBM Interconnect

By Roy Graham, President and COO, DRC Computer Corp.

New business models bring new opportunities, and my relationship with IBM is proof-positive of that fact. Although I respected them, in the previous way of doing business they were the competition, and it was us or them. Wow, has that changed! In the last year working with IBM I see a very new company and the OpenPOWER organization as a real embodiment of a company wanting to partner and foster complementary technologies.


DRC Computer (DRC) builds highly accelerated, low latency applications using FPGAs (Field Programmable Gate Arrays). These chips offer massive parallelism at very low power consumption. By building applications that exploit this parallelism, we can achieve acceleration factors of 30 to 100+ times the equivalent software version. We have built many diverse applications in biometrics, DNA familial search, data security, petascale indexing, and others. At InterConnect 2016 I’ll be highlighting two applications: massive graph network analytics and fuzzy-logic-based text/data analysis. More details on some of the DRC applications can be found here.

We are working closely with the CAPI group at IBM to integrate the DRC FPGA-based solutions into Power systems. One of the early results of this cooperation was a demonstration of the DRC graph network analytics at SC15 running on a POWER8 system using a Xilinx FPGA.

OpenPOWER provides DRC with a large and rapidly expanding ecosystem that can help us build better solutions faster and offer partnerships that will vastly expand our market reach. The benefit for our customers will be a more fully integrated solution and improved application economics. In Session 6395 on Feb 23rd at 4:00pm PT I will be presenting this work with FPGAs at IBM’s InterConnect Conference in Las Vegas as part of a four-person panel discussing OpenPOWER.

In the session, I’ll cover the DRC graph network analytics and fuzzy-logic-based text/data analysis. The graph networking system implements Dijkstra and Betweenness Centrality algorithms to discover and rank relationships between millions of people, places, events, objects, and more, achieving in excess of 100x acceleration compared to a software-only version. As a least-cost path and centrality analysis, it has broad applicability in many areas, including social network analysis, distribution route planning, aircraft design, epidemiology, and stock trading. The fuzzy-logic-based text/data analytics was designed for social media analysis, and captures common social media misspellings, shorthand, and mixed language usage. The DRC product is tolerant of these and enables an analyst to do a score-based approximate match on the phrases or words they are searching for. We can search on hundreds of strings simultaneously on one FPGA, achieving acceleration factors of 100x over equivalent software.
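
DRC’s implementation runs in FPGA hardware and is proprietary, but the least-cost-path idea itself is easy to illustrate. Here is a minimal software sketch of Dijkstra’s algorithm in Python; the graph and node names are invented for illustration and have nothing to do with DRC’s actual datasets:

```python
import heapq

def dijkstra(graph, start):
    """Least-cost path distances from start to every reachable node.
    graph maps node -> list of (neighbor, edge_cost) pairs."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Toy relationship graph; edge weights model "relationship cost"
graph = {
    "alice": [("bob", 1), ("carol", 4)],
    "bob": [("carol", 2), ("dave", 7)],
    "carol": [("dave", 1)],
}
print(dijkstra(graph, "alice"))  # {'alice': 0, 'bob': 1, 'carol': 3, 'dave': 4}
```

The hardware version accelerates exactly this kind of traversal by exploring many edges in parallel, which is where the quoted 100x factors come from.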

OpenPOWER is opening up whole new uses for FPGAs, and through the collaborative ecosystem, the greatest minds in the industry are working on unlocking the power of accelerators. In an era where the performance of systems comes not just from the chip but from the entire system stack, OpenPOWER’s new business model is the key to driving innovation and transforming businesses. Please join me at Session 6395 on Feb 23rd at 4:00pm PT; I look forward to collaborating with you and our fellow members in the OpenPOWER ecosystem.

Roy Graham is the President and COO of DRC Computer Corp. and builds profitable revenue streams for emerging technologies, including data analytics, communications, servers, human identification systems, and hybrid applications. At Digital and Tandem, Roy ran Product Management groups delivering more than $10B in new revenue. He then served as SVP of Sales and Marketing at Wyse (a $250M turnaround) and at Be (IPO), and as CEO at two early-stage web-based companies.


New OpenPOWER Member Brocade Showcases Work at Mobile World Congress

By Brian Larsen, Director, Partner Business Development, Brocade

In my 32-year career in the IT industry, there has never been a better time to embrace the partnership needed to meet client requirements, needs, and expectations.  Brocade has built its business on partnering with suppliers who deliver enterprise-class infrastructure in all the major markets. This collaborative mindset is what led us to the OpenPOWER Foundation, where an ecosystem of over 180 vendors, suppliers, and researchers can build new options for client solutions.

Brocade recognizes that OpenPOWER platforms are providing choice, and with that choice comes the need to enable those platforms with the same networking capabilities that users are familiar with.  Unless you have been in a cave for the last eight years, you know that Brocade has broken out of its mold of being a fibre channel switch vendor and now supports a portfolio of IP networking platforms along with innovative solutions in Software Defined Networking (SDN) and Network Function Virtualization (NFV). Our work will allow our OpenPOWER partners to design end-to-end solutions that include both storage and IP-networked solutions.  Use cases for specific industries can be developed for high-speed network infrastructure for M2M communication or compute-to-storage requirements.  As target use cases evolve, networking functionality could transform from a physical infrastructure to a virtual architecture where the compute platform is a critical and necessary component.

The OpenPOWER Foundation’s membership has exploded since its inception and is clearly making a mark on new data center options for users who expect peak performance to meet today’s demanding IT needs.  As Brocade’s SVP and GM of Software Networking, Kelly Herrell, says, “OpenPOWER processors provide innovation that powers datacenter and cloud workloads.”  Enterprise data centers and service providers (SPs) are key areas of focus for Brocade, and by delivering on the promise of the “New IP,” Brocade will enable businesses to transition to more automation, accelerated service delivery, and new revenue opportunities.

Brocade will be at Mobile World Congress in Barcelona and IBM’s InterConnect Conference in Las Vegas from February 22-25th. Come see us and let us show you the advantages of being an ecosystem partner with us.

Brian Larsen joined Brocade in July 1991 and has more than 29 years of professional experience in high-end processing, storage, disaster recovery, cloud, virtualization, and networking environments. Larsen is the Director of Partner Business Development, responsible for solution and business development within all IBM divisions. For the last five years, he has focused on both service provider and enterprise markets, with specific focus areas in cloud, virtualization, Software Defined Networking (SDN), Network Function Virtualization (NFV), Software Defined Storage (SDS), and analytics solutions.

OpenPOWER Summit 2016: Vive la Révolution!

By Calista Redmond, President, OpenPOWER Foundation

OpenPOWER Summit is coming up April 5-8 in San Jose and we want you there! In fact, we’re making it easier than ever for you to attend by offering 20% off your registration fee. Just input the discount code OPFSUMMIT2016 during checkout to get up to $300 off the cost of admission!

Can’t join us in-person? Be sure to follow us on Twitter at @OpenPOWERorg and use the hashtag #OpenPOWERSummit to get the latest from our on-the-ground social reporters and join the conversation!

The OpenPOWER Foundation’s model of open collaboration between organizations has flipped the script and spawned an incredibly engaged community. We’re not adjusting the dial, we’re leveling the playing field and changing the game for both open software and hardware. This is a revolution for our industry.

Hear from vendors like Mellanox, NVIDIA, Tyan, Nallatech, and IBM on their latest hardware innovations. See how MSPs like Rackspace, Arrow ECS, and Redis Labs are bringing OpenPOWER into the cloud. Get hands on with Canonical and Ubuntu to experience how OpenPOWER is built upon the leading open source operating system, Linux, and how we’re embracing and practicing the ideals of open source.

We invite you to join us at the OpenPOWER Summit where you can:

  • see over 50 presenters and speakers share their OpenPOWER-driven innovation including talks from end users as well as hardware and software innovators,
  • visit a show floor of demos to understand OpenPOWER innovations in action,
  • network with your peers at the OpenPOWER Pavilion Theater to embrace the open spirit of collaboration,
  • join  our ISV Roundtable to hear from cross-industry leaders about how OpenPOWER is accelerating their business,
  • get hands on with CAPI and learn from OpenPOWER’s brightest engineers during our CAPI Lab, and
  • watch for more workshops and speakers to be announced in the coming weeks!

And of course, grab a drink from our famous OpenPOWER Ice Bar!

To attend the Summit, register here using the OpenPOWER Member 20% off discount code OPFSUMMIT2016.

And be sure to follow us on Twitter, Facebook, LinkedIn, and Google+ to stay up to date with the latest news and use the hashtag #OpenPOWERSummit to join the conversation.

Vive la Révolution!

Announcing a New Era of Openness with Power 3.0

By Michael Gschwind, Chief Architect & Senior Manager, Power System Architecture, IBM 

I am excited to announce the availability of the next generation of the Power Architecture, ushering in a new era for systems. The new Power Instruction Set Architecture 3.0 (Power ISA 3.0) marks the first generation of architecture developed and released since the creation of the OpenPOWER Foundation, building upon and sustaining the growth of the Foundation’s open ecosystem of collaborative innovation.


The Power ISA 3.0 architecture reflects the values of our open ecosystem, enhancing the platform by continuing the evolution of the RISC ISA concepts pioneered by the Power Architecture to deliver high-performance scalable systems optimized around workload needs. The new architecture specification includes enhancements such as:

  • Improved support for string and memory block operations with the vector string facility
  • Expanded little-endian support
  • Instruction fusion and PC-relative addressing in support of improved application portability
  • Hardware garbage collection acceleration
  • Enhanced in-memory database support
  • Interrupt and system call enhancements
  • Hardware support for the native Linux radix page table format
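
Byte order determines how a multi-byte value is laid out in memory or on the wire, which is why expanded little-endian support matters for application portability. As a quick, generic illustration (plain Python, not Power-specific), here is how the same 32-bit value serializes under each byte order:

```python
import struct

value = 0x0A0B0C0D

# The same 32-bit integer, serialized under each byte order:
big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a

# Reading little-endian bytes as big-endian silently yields a different value:
assert struct.unpack(">I", little)[0] == 0x0D0C0B0A
```

An ISA that handles both orderings natively spares software this kind of conversion mismatch when exchanging data with the large body of little-endian Linux applications.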

These new updates mean that the most important operations of a broad range of workloads will benefit from targeted optimizations to accelerate them even as speedups from semiconductor technology improvements can no longer be taken for granted.

Power ISA 3.0 supports the entire spectrum of application choices with a common architecture definition. This ensures the OpenPOWER ecosystem enjoys the same level of compatibility that IBM enterprise customers have enjoyed over the past three decades. Consequently, Power ISA 3.0 no longer has optional categories, or separate server and embedded ISA architecture options, as the new specification supports the entire range of implementations. This allows for simpler sharing of application and system software across the entire range of Power processor implementations, enabling software developers to more easily support a broader range of applications, and ensuring that OpenPOWER compliant applications truly support a “write once, run everywhere” application development model.

In addition, the new Power ISA 3.0 enables architects to build on a solid base and protect today’s investments in Power-based software and solutions, by maintaining compatibility for applications developed in previous architecture generations.  Consequently, programs going back to the beginning of POWER remain compatible with the new Instruction Set Architecture defined by Power ISA 3.0.

To learn more about the Power Instruction Set Architecture, read the full description at, or join me at the OpenPOWER Summit 2016 to discuss this and other new developments from the OpenPOWER Foundation.

Michael Gschwind is a Chief Architect for Power Systems and the Chief Engineer for Machine Learning and Deep Learning in IBM’s Systems Group. He was also a Chief Architect responsible for creating the little-endian Power software environment which forms the foundation of the OpenPOWER ecosystem, and the software environment for the Cell SPE, the first general-purpose programmable accelerator. Dr. Gschwind is an IBM Master Inventor, a member of the IBM Academy, and a Fellow of the IEEE.

Workshop Recap: OpenPOWER Academic Community Shares Latest Advances

By Ganesan Narayanasamy (OpenPOWER Academic Discussion Group Leader)

Before SC15 kicked off on November 15th, about 40 of the 170+ members of the OpenPOWER Foundation were already gathered in Austin to discuss OpenPOWER technologies for High Performance Computing (HPC) and High Performance Data Analytics applications. Ready to share and discuss their work on emerging workload optimization for OpenPOWER, members of the OpenPOWER Academic Discussion Group (ADG) took the opportunity to network, share knowledge, and explore commercial collaboration opportunities at the ADG’s first annual meeting.  Presentations from the meeting are available for download here.

Participants included representatives from NVIDIA, Jülich Supercomputing Centre, Delft University of Technology, Texas Advanced Computing Center, Oak Ridge National Laboratory, A*STAR Computational Research Centre, the UK’s STFC Hartree Centre, and IBM (Apache Spark on Power, POWER8, Genomics on Power).


Prof. Dirk Pleiter, Jülich Supercomputing Centre

From getting the most out of the POWER processor’s advanced features and GPU acceleration to using new Big Data frameworks like Apache Spark along with FPGA acceleration, workshop participants shared their early work in exploiting OpenPOWER to deploy advanced system architectures capable of breakthrough performance when running the most challenging computing workloads.


Dr. Jack Wells, Oak Ridge National Laboratory Director of Science, Oak Ridge Leadership Computing Facility

Coming together across international, disciplinary and academic boundaries, OpenPOWER ADG members united around common interests including fundamental computing challenges in HPC (parallelization, latency, bandwidth, job throughput) and use of the latest community and commercially supported software. It was a first step in launching formal collaboration around OpenPOWER across the Academic Community, who are among the first to address the newest, hardest problems in science and scientific computing.

As we begin the New Year, I look forward to seeing OpenPOWER ADG members continue to shape and apply OpenPOWER technology to push the boundaries of computing ever further, together.

Join Us!

  • If interested in joining the OpenPOWER Academic Discussion Group, please email Ganesan Narayanasamy
  • Visit the ADG at the OpenPOWER Summit 2016 (check back here for details coming soon)
  • Visit the ADG at ISC 2016 in Frankfurt (check back here for details)
  • Join the 2nd annual ADG Workshop at SC 2016 in Salt Lake City (check back here for details)

About the Author

Ganesan Narayanasamy is a Senior Manager with IBM Systems Lab and brings 15 years of experience in High Performance Computing R&D and technical leadership to his many activities within the OpenPOWER Foundation, including leadership of the Foundation’s Academic Discussion Group.  He’s passionate about working with universities and research institutes, helping them develop curriculum, labs, and centers of excellence around OpenPOWER technology.

Continuing the Datacenter Revolution

By John Zannos and Calista Redmond

Dear Fellow Innovators,

As the newly elected Chair and President of the OpenPOWER Foundation, we would like to take this opportunity to share our vision as we embark on a busy 2016.  Additionally, we want to make sure our fellow members — all 175 of us and growing — are aware of the many opportunities we have to contribute to our vibrant and growing organization.

Our Vision

First, the vision.  Through an active group of leading technologists, OpenPOWER in its first two formative years built a strong technical foundation — developing the literal bedrock of hardware and software building blocks required to enable end users to take advantage of POWER’s open architecture.  With several jointly developed OpenPOWER-based servers already in market, a growing network of physical and cloud-based test servers and a wide range of other resources and tools now available to developers around the world, we have a strong technical base.  We are now moving into our next phase: scaling the OpenPOWER ecosystem.  How will we do this?  With an unwavering commitment to optimize as many workloads on the POWER architecture as possible.

It is in this vein that we have identified our top three priorities for 2016:

  1. Tackle system bottlenecks through collaboration on memory bandwidth, acceleration, and interconnect advances.
  2. Grow workloads and software community optimizing on OpenPOWER.
  3. Further OpenPOWER’s validation through adoption conveyed via member and end user testimonials, benchmarking, and industry influencer reports.

As employees of Canonical and IBM, and active participants in OpenPOWER activities dating back to the early days, we share a deep commitment to open ecosystems as a driver for meaningful innovation.  Combining Canonical’s leadership in growing software applications on the POWER architecture with IBM’s commitment to open development at all levels of the stack on top of the POWER architecture, we stand ready to help lead an even more rapid expansion of the OpenPOWER ecosystem in 2016.  This commitment, however, extends well beyond Canonical and IBM to the entire Board leadership, which continues to reflect the diversity of our membership.  Two of the original founders of OpenPOWER — our outgoing chair Gordon MacKean of Google and president Brad McCredie of IBM — will remain close and serve as non-voting Board Advisors, providing guidance on a wide range of technical and strategic activities as needed. To read Gordon MacKean’s perspective on OpenPOWER’s growth, we encourage you to read his personal Google+ post.

In driving OpenPOWER’s vision forward, we are fortunate to have at our disposal not just our formal leadership team, but a deep bench of talent throughout the entire organization – you – literally dozens of the world’s leading technologists representing all levels of the technology stack across the globe. With your support behind us, we’re sure the odds are stacked in our favor, and we can’t wait to get started.

Get Involved

So, now that you’ve heard our vision for 2016, how can you get involved?


  • Make the most out of the 2016 OpenPOWER Summit – Register to attend, exhibit, submit a poster or present at this year’s North American OpenPOWER Summit in San Jose April 5-7. And, think about what OpenPOWER-related news you can reveal at the show.  We are expecting 200+ press and analysts to attend, so this is an opportunity for Members to get some attention.  Be on the lookout for a “Call for News” email soon.  Click here to register and get more details.  Specific questions can be directed to the Summit Steering Committee at
  • Contribute your technical expertise – Share your technical abilities and drive innovation with fellow technology industry leaders through any of the established Technical Work Groups. Contact Technical Steering Committee Chair Jeff Brown at to learn more or to join a work group.
  • Shape market perceptions – Share your marketing expertise and excitement for the OpenPOWER Foundation by joining the marketing committee. Email the marketing committee at to join the committee or learn more.
  • Join the Academic Discussion Group – Participate in webinars, workshops, contests, and collaboration activities. Email Ganesan Narayanasamy at to join the group or learn more.
  • Link up with geographic interests – The European member organizer is Amanda Quartly at The Asia Pacific member organizer is Calista Redmond at
  • Tap into technical resources – Use and build on the technical resources, cloud environments, and loaner systems available. Review what technical resources and tools are now available and the growing network of physical and cloud-based test servers available worldwide.
  • Engage OpenPOWER in industry events and forums – Contact Joni Sterlacci at if you know of an event which may be appropriate for OpenPOWER to have an official presence.
  • Share your stories – Send your end-user success stories, benchmarks, and product announcements to OpenPOWER marketing committee member Greg Phillips at
  • Write a blog – Submit a blog to be published on the OpenPOWER Foundation blog detailing how you’re innovating with OpenPOWER. Send details to OpenPOWER Foundation blog editor Sam Ponedal at
  • Join the online discussion – Follow and join the OpenPOWER social conversations on Twitter, Facebook, LinkedIn and Google+.

And, finally, please do not hesitate to reach out to either of us personally to discuss anything OpenPOWER-related at any time.  Seriously.  We’d love to hear from you!

Yours in collaboration,

John Zannos, OpenPOWER Chair
Calista Redmond, OpenPOWER President

OpenPOWER: The Rebel Alliance of the Industry

By Sam Ponedal, Social Strategist for OpenPOWER


Episode 8: OpenPOWER


A dark empire has spread across the Compute Galaxy. Driven by a zealous belief in an antiquated law, the empire seeks to place the universe’s IT practitioners under their repressive rule.

But a new force is rising. Seeking to define a new approach to hardware based on Open Acceleration and collaboration, the OpenPOWER Foundation’s 170+ member ecosystem is challenging the empire.

Driven by open innovation, the OpenPOWER Foundation is achieving new levels of performance…

(To see this crawl as it’s meant to be viewed, click here.)

OK, I had to get that out of my system. If you’re like me, this is a week you’ve been waiting a decade for: the release of the next Star Wars film, The Force Awakens. Everyone is talking about it; it’s arguably one of the most anticipated events in cinema history.

As if I didn’t have enough to be excited about, at Supercomputing 2015 in Austin, TX last month, analyst Dan Olds dubbed OpenPOWER the “Rebel Alliance of the industry,” and I couldn’t agree more. Like a Wookiee in a china shop, this thought was bursting to get out, so I wrote it all down and examined the ways that OpenPOWER is like the Rebel Alliance.

First off, what’s one of the most iconic symbols of the Rebel Alliance? That’s right, the X-Wing Fighter.

It’s versatile, powerful, and always gets the job done. When the Rebels need to take down the Death Star, they call on a squadron of X-Wings to target the exhaust port. For OpenPOWER, the X-Wing is the POWER8 processor. Its 4x thread-per-core advantage over x86 is reminiscent of the X-Wing’s four wings. Couple that with POWER8 benchmarks showing more than 20% better performance than x86, and it’s clear that POWER8 is the workhorse of the OpenPOWER Rebel Alliance. This raises the question: if POWER8 is an X-Wing, what’s x86? That’s an easy one: a TIE fighter, and any Star Wars fan knows what happens when a TIE fighter and an X-Wing go at it.

IBM believes the future of HPC and enterprise data centers lies in an accelerated data center architecture, consisting of accelerated computing, accelerated storage, and accelerated networking. New accelerators, storage, and networking devices are coming from several technology companies.

Accelerators are all about speed, and if you ever need to make the Kessel run you know that you need the fastest ship in the galaxy: the Millennium Falcon. Accelerator technology takes what the industry considered “fast” and jumps it to lightspeed. Coupled with the POWER8 processor, an accelerator can outrun any task, or Imperial Star Destroyer, thrown at it. Just don’t forget to check the negative power couplings.

But one of OpenPOWER’s strongest assets is its developers, who embody the ideals of open and collaboration, our own Jedi Order. Just as the Jedis are the defenders of truth and justice in the galaxy, so are our developers the custodians of innovation in an open hardware and software ecosystem. And like the Jedi Order, we know that it is important to train and provide tools to the next generation of OpenPOWER Developer so that they can hone their skills within the ecosystem.

That’s why we recently expanded Supervessel, OpenPOWER’s development cloud, to feature new GPU acceleration as a service, deep learning frameworks, and access to cloud-based FPGAs. Add to that our collaborations with the University of Texas’s TACC and Oregon State University’s Open Source Lab to offer free development resources available to anyone worldwide.

But the best part of the Rebel Alliance? That it is open to anyone seeking refuge and asylum from the Empire, and the same is true for OpenPOWER. Our collaborative ecosystem is welcoming to all joiners, and we maintain an open door for people seeking to revolutionize the data center through open hardware and open software. If you would like to get involved in OpenPOWER, read more about the different levels of membership and engagement. And for the latest news be sure to follow us on Twitter, Facebook, and LinkedIn.

Thank you, and may the Open Source be with you.

Sam Ponedal is an IBM Social Strategist responsible for OpenPOWER’s social presence. He is an avid tech enthusiast, geek, and nerd who uses puns way more than necessary in a professional environment. You can follow him on Twitter to see his latest.

Workshop Recap: OpenPOWER Personalized Medicine Working Group

By Zaid Al-Ars, Cofounder, Bluebee and Chair of the OpenPOWER Foundation Personalized Medicine Working Group

More than 40 participants attended the OpenPOWER Personalized Medicine Workshop in Austin, TX on November 15, 2015.  The workshop gathered leading experts to address computational technology in the field of personalized medicine including challenges, opportunities and future developments.

Separate sessions featured the perspectives of clinical users, technology providers, and HPC researchers, followed by a panel discussion on overall industry challenges and trends.

Session 1: Clinical Users Perspective

Dr. John Zhang (MD Anderson) described the state-of-the-art computational infrastructure at the MD Anderson Cancer Center used for the analysis of the center’s genomics pipelines, followed by a discussion of future challenges in genomics data storage, clinical algorithm adaptation, data mining and data visualization.

Dr. Hans Hofmann (UT Austin) presented a global analytical framework for linking genotype information to phenotype information by addressing the biochemistry, cell biology and physiological aspects of an organism, charting the associated computational and analytical challenges. He noted that for personalized medicine approaches to succeed, we must increase our understanding of the causes and consequences of individual and population variation well beyond current genome-wide association and genotype variation studies.


Session 2: Technology Providers Perspective

Dr. Zaid Al-Ars (Bluebee) presented Bluebee’s platform for the genome analysis challenge – an accelerated, HPC-based private cloud solution to speed up the processing of mass volumes of genomics data. The platform provides unrestricted scale-up and on-the-fly provisioning of computational and data storage capacity, along with industry-grade security and data integrity features. Bluebee’s platform abstracts away the complexity of specialized HPC technologies such as hardware acceleration, offering an easy environment in which to deploy Bluebee as well as other OpenPOWER genomics technologies.

Dr. Yinhe Cheng (IBM) discussed IBM’s porting and optimization efforts around its high performance infrastructure for genomics, including:

  • BioBuilds, a curated and versioned collection of open source bioinformatics tools for genomics, delivering 49 pre-built, POWER8-optimized bioinformatics application binaries
  • Broad Best Practices pipeline (BWA/GATK) acceleration on POWER8, demonstrating 2x to 70x analysis speedups across various components of the pipeline – a collaborative effort among IBM, Xilinx and Bluebee
  • Speedup of whole human genome analysis from days to less than half an hour using the Edico Genome solution on Power


Session 3: HPC Researchers Perspective

Dr. Ravishankar Iyer (University of Illinois Urbana-Champaign) presented research projects focused on improving the performance of cancer diagnostics pipelines, including a computational pipeline coded from scratch that executes significantly faster than current state-of-the-art pipelines. He also presented algorithms for health monitoring systems and wearable devices being integrated into a unified personalized medicine platform.

Dr. Jason Cong (UCLA) presented a Spark-based approach enabling big data applications on a scale-out, hybrid CPU-FPGA cluster architecture. The approach is being used to achieve substantial performance increases for genomics computational pipelines such as those used in whole-genome and whole-exome sequencing experiments.

Dr. Wayne Luk (Imperial College London) gave a talk covering reconfigurable acceleration of genomics data processing and compression, demonstrating FPGA-accelerated speedup of parts of RNA diagnostics pipelines used to identify cancer. To address the large sizes of genomics datasets, his group implemented accelerated compression algorithms to speed up the storage and management of DNA information. His continuing efforts focus on the optimization and speedup of transMART downstream DNA data analysis on IBM Power platforms.

Challenges and Trends Panel Discussion

Four experts representing various users of genomics information and pipelines participated in a panel moderated by Dr. Peter Hofstee (IBM).

Dr. Webb started the discussion, emphasizing that scientists and research groups working in isolation cannot answer the relevant questions in personalized medicine. Rather, close collaboration among multidisciplinary teams of doctors, geneticists, computer scientists and mathematicians is required to answer difficult questions and develop suitable models and efficient computational methods for use in a clinical environment.

Mr. Greer pointed out that changes are needed to enable effective analysis of personalized medicine information. For example, the lack of unified approaches to documenting and storing patient medical records complicates linking the different sources of information relevant to personalized medical care.

Answering a question from Dr. Hofstee about challenges in the growing field of population sequencing, Dr. Zhang identified the need to help doctors in making actionable decisions based on patient medical information. Dr. Hofmann commented that even common tasks such as data transmission are rapidly becoming a bottleneck due to the staggering sizes of population sequencing information. He further elaborated that standards are needed to ensure security and easy integration between the various genomics data types.

The panel concluded that the community must address computational approaches that consider the inherent variations of the human genome and the different ways these variations play a role in the individual. This will provide doctors with the tools needed to identify levels of confidence associated with a specific therapeutic intervention. Such tools will play an important role in the medical revolution of personalized medicine.

About Zaid Al-Ars

Zaid Al-Ars is cofounder of Bluebee, where he leads the development of the Bluebee genomics solutions. Zaid is also an assistant professor at the Computer Engineering Lab of Delft University of Technology, where he leads the research and education activities of the multi/many-core research theme of the lab. Zaid is involved in groundbreaking genomics research projects such as the optimized child cancer diagnostics pipeline with University Medical Center Utrecht and de novo DNA assembly research projects of novel organisms with Leiden University.

Video: What Does “Open” Mean to You?

By OpenPOWER Foundation

Last month the OpenPOWER Foundation was in full force at Supercomputing 2015 in Austin, TX. We had a great time networking with other revolutionaries who are embracing open hardware to revolutionize the data center. We decided to meet with some OpenPOWER members and ask them a simple question: “What does ‘open’ mean to you?” These are their answers.

Read more about how the OpenPOWER Foundation is leading the open hardware revolution

Video: IBM and OpenPOWER Partner with Oak Ridge National Labs to Solve World’s Toughest Challenges

By Jack Wells, PhD, Director of Science, Oak Ridge National Laboratory

The mission of Oak Ridge National Laboratory (ORNL) is to deliver scientific discoveries and technical breakthroughs that will accelerate the development and deployment of solutions in clean energy and global security, and in doing so create economic opportunity for the nation. By partnering with OpenPOWER, we are using the next generation POWER and GPU processor technologies to build Summit, a supercomputer that will have 5x-10x greater performance than today’s leadership systems.

With Summit in place, ORNL will be able to better focus our scientific and technical expertise and apply our leadership-class data and compute infrastructure to solve some of the greatest challenges of our time. We will be able to provide new insights related to climate change, understand the molecular machinery of the brain, better control combustion for cleaner-running engines, and perform a full physics simulation of ITER to improve the performance of this fusion reactor.

To learn more, watch this behind the scenes look at the process here at ORNL.

Jack Wells is the Director of Science for the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science national user facility, and is responsible for the scientific outcomes of the OLCF’s user programs. Wells previously led both ORNL’s Computational Materials Sciences group in the Computer Science and Mathematics Division and the Nanomaterials Theory Institute in the Center for Nanophase Materials Sciences. Prior to joining ORNL as a Wigner Fellow in 1997, Wells was a postdoctoral fellow within the Institute for Theoretical Atomic and Molecular Physics at the Harvard-Smithsonian Center for Astrophysics.

Delft University Analyzes Genomics with Apache Spark and OpenPOWER

By Zaid Al-Ars, Cofounder, Bluebee, Chair of the OpenPOWER Foundation Personalized Medicine Working Group, and Assistant Professor at Delft University of Technology

The collaboration between the Computer Engineering Lab of the Delft University of Technology (TUDelft) and the IBM Austin Research Lab (ARL) started two years ago. Peter Hofstee invited me for a sabbatical visit to ARL to collaborate on big data challenges in the field of genomics and to investigate areas of common interest to work on together. The genomics field poses a number of challenges for high-performance computing systems and requires architectural optimizations to various subsystem components to effectively run the algorithms used in this field. Examples of such required architectural optimizations are:

  • Optimizations to the I/O subsystem, due to the large data file sizes that need to be accessed repetitively
  • Optimizations to the memory subsystem, due to the in-memory processing requirements of genomics applications
  • Optimizations to the scalability of the algorithms to utilize the available processing capacity of a cluster infrastructure.

To address these requirements, we set out to implement such genomics algorithms using a scalable big data framework that is capable of performing in-memory computation on a high performance cluster with optimized I/O subsystem.

Frank Liu and Zaid Al-Ars stand next to the ten-node POWER8 cluster running their tests


Sparking the Innov8 with POWER8 University Challenge

From this starting point, we had the idea of building a high-performance system for genomics applications and entering it in the Innov8 with POWER8 University Challenge. In the process, the TUDelft would bring together various OpenPOWER technologies developed by IBM, Xilinx, Bluebee and others to create a solution for a computational challenge that has a direct impact on healthcare through cancer diagnostics, as well as a scientific impact on genomics research in general. We selected Apache Spark as our big data software stack of choice, due to its scalable in-memory computing capabilities and the easy integration it offers with a number of big data storage systems and programming APIs. However, a lot of work was needed to realize this solution, both in the practicalities of installing and running Apache Spark on Power systems, something that had not been done at the time, and in building the big data framework for genomics applications.

The first breakthrough came a couple of months after my sabbatical, when Tom Hubregtsen (then a TUDelft student working on his MSc thesis within ARL) was able to set up and run an Apache Spark implementation on a POWER8 system by modifying and rewriting a whole host of libraries and middleware components in the software stack. Tom worked hard to achieve this important feat as a stepping-stone to his actual work on integrating Flash-based storage into the Spark software stack. He focused on CAPI-connected Flash and modified Apache Spark to spill intermediate data directly to the Flash system. The results were very promising, showing up to a 70% reduction in overhead as a result of the direct data spilling.
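The spilling idea itself is simple: keep intermediate records in memory until a budget is exhausted, then write them out to fast storage and carry on. A toy sketch of the pattern in plain Python (this is not the actual Spark internals Tom modified; the budget and paths are invented, and a temporary directory stands in for CAPI-attached Flash):

```python
import os
import pickle
import tempfile

class SpillableBuffer:
    """Keeps records in memory until a budget is hit, then spills to storage."""

    def __init__(self, memory_budget=1000, spill_dir=None):
        self.memory_budget = memory_budget        # max in-memory records (invented)
        self.spill_dir = spill_dir or tempfile.mkdtemp()
        self.in_memory = []
        self.spill_files = []

    def add(self, record):
        self.in_memory.append(record)
        if len(self.in_memory) >= self.memory_budget:
            self._spill()

    def _spill(self):
        # Write the current in-memory batch to the spill target and reset.
        path = os.path.join(self.spill_dir, f"spill_{len(self.spill_files)}.pkl")
        with open(path, "wb") as f:
            pickle.dump(self.in_memory, f)
        self.spill_files.append(path)
        self.in_memory = []

    def __iter__(self):
        # Read back spilled batches first, then whatever is still in memory.
        for path in self.spill_files:
            with open(path, "rb") as f:
                yield from pickle.load(f)
        yield from self.in_memory

buf = SpillableBuffer(memory_budget=3)
for i in range(10):
    buf.add(i)
print(len(buf.spill_files))   # 3 spill files; one record remains in memory
```

The performance question Tom's work addressed is where that spill path points: redirecting it from ordinary disk to CAPI-attached Flash is what produced the overhead reduction quoted above.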

Building on Tom’s work, Hamid Mushtaq (a researcher at the TUDelft) successfully ran Spark on a five-node IBM Power cluster owned by the TUDelft. Hamid then created a Spark-based big data framework that segments the large data volumes used in the analysis and transparently distributes the analysis across a scalable cluster. He also used the in-memory computation capabilities of Spark to enable dynamic load balancing across the cluster, depending on the processing requirements of the input files. This enables efficient utilization of the available computation resources in the compute cluster. Results show that we can reduce the compute time of well-known pipelines by more than an order of magnitude, cutting execution time from hours to minutes. This implementation is now being ported by Frank Liu at ARL to a ten-node POWER8 cluster to check for further scalability and optimization potential.
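The segment-and-balance idea can be illustrated outside Spark: cut the genome into independently analyzable regions, then schedule the regions onto workers largest-first so the load evens out. A simplified Python sketch (the chromosome lengths, segment size, and worker count here are invented for illustration, and the analysis step is a placeholder):

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_regions(chromosome_lengths, segment_size):
    """Cut each chromosome into fixed-size regions that can be analyzed independently."""
    regions = []
    for chrom, length in chromosome_lengths.items():
        for start in range(0, length, segment_size):
            regions.append((chrom, start, min(start + segment_size, length)))
    return regions

def analyze_region(region):
    # Placeholder for a real pipeline stage (alignment, variant calling, ...).
    chrom, start, end = region
    return (chrom, start, end - start)

# Invented example figures, not real chromosome sizes
lengths = {"chr1": 2500, "chr2": 1800}
regions = split_into_regions(lengths, segment_size=1000)

# Largest-first ordering is a simple form of load balancing:
# big regions start early and small ones fill the gaps.
regions.sort(key=lambda r: r[2] - r[1], reverse=True)

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(analyze_region, regions))

print(len(regions))   # 3 regions from chr1 + 2 from chr2 = 5
```

In the actual framework, Spark's scheduler plays the role of the thread pool, and the balancing is driven dynamically by the processing requirements of the input files rather than a static sort.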

Left to right: Hamid Mushtaq, Sofia Danko and Daniel Molnar


FPGA Acceleration

Keeping in mind the high computational requirements of the various genomics algorithms used, as well as the parallelism available in these algorithms, we identified early on the benefits of using FPGA acceleration to improve performance even further. However, it is rather challenging to use hardware acceleration in combination with Spark, something that had not been shown to work on any system before, mainly because of the difficulty of integrating FPGAs into the Java-based Spark software stack. Daniel Molnar (an internship student at the TUDelft) took up this challenge and within a short time wrote native functions that connect Spark through the Java Native Interface (JNI) to FPGA hardware accelerators for specific kernels. These kernels are now being integrated and evaluated for their system requirements and the speedup they can achieve.

Improving Genomics Data Compression

Further improvements to the scalable genomics Spark pipeline are being investigated by Sofia Danko (a TUDelft PhD student), who is analyzing the accuracy of the analysis on Power and proposing approaches to ensure high-quality output that can be used in a clinical environment. She is also investigating state-of-the-art genomics data compression techniques to facilitate low-cost storage and transport of DNA information. Initial results show that specialized compression techniques can reduce genomics input files to a fraction of their original size, achieving compression ratios as low as 16%.
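To get a rough feel for why DNA compresses so well, note that a four-letter alphabet carries at most two bits of information per eight-bit byte, so even generic general-purpose compression shrinks it considerably; the specialized genomics techniques mentioned above exploit reference alignment and other domain structure to go much further. A quick check with Python’s zlib on synthetic data (not the actual tools or datasets from the project):

```python
import random
import zlib

random.seed(42)
# Synthetic read data: a 4-letter alphabet is far below 8 bits of entropy per byte
dna = "".join(random.choice("ACGT") for _ in range(100_000)).encode()

compressed = zlib.compress(dna, level=9)
ratio = len(compressed) / len(dna)
print(f"{ratio:.0%}")   # prints the ratio; well under half the original size
```

Even this naive approach lands near the two-bits-per-base floor for random sequence; real genomes are far from random, which is how specialized compressors reach ratios like the 16% cited above.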

We are excited to be part of the Innov8 university challenge. Innov8 helps the students work as a team with shared objectives and motivates them to achieve ambitious goals with a societal impact they can be proud of. The team is still working to improve the results of the project by increasing both the performance and the quality of the output. We are also looking forward to presenting our project at the IBM InterConnect 2016 conference and to competing with the other world-class universities participating in the Innov8 university challenge.

Zaid Al-Ars is cofounder of Bluebee, where he leads the development of the Bluebee genomics solutions. Zaid is also an assistant professor at the Computer Engineering Lab of Delft University of Technology, where he leads the research and education activities of the multi/many-core research theme of the lab. Zaid is involved in groundbreaking genomics research projects such as the optimized child cancer diagnostics pipeline with University Medical Center Utrecht and de novo DNA assembly research projects of novel organisms with Leiden University.

Setting High Standards for OpenPOWER Hardware Architecture


By Michael Gschwind, Chief Architect & Senior Manager, Power System Architecture, IBM

When we founded OpenPOWER to create a new, inclusive ecosystem built around collaborative innovation, we knew that innovation needed to be built around a core of common standards: standards that ensure interoperability of new technologies, and that assure hardware manufacturers, software developers, partners and customers that choosing OpenPOWER was an investment in the future, a choice for a future with growing performance, growing markets and interoperable solutions built to last.

At the core of this open ecosystem, we needed a platform of uncompromising quality and compatibility across hardware and software upon which to build transformative solutions for a connected planet. We were fortunate to have an unparalleled breadth of skills among the founding members to set the course. Each company had revolutionized its respective field or fields: computer graphics, accelerators, high-speed networking, innovative system design, system virtualization, modern computer architecture and design, internet content services, hyperscale data centers, cloud computing, and more.

To create a common reference point for the entire ecosystem, together we created the first three OpenPOWER workgroups, for Hardware Architecture, System Software, and Architecture Compliance.  We tasked these groups with identifying and standardizing the fundamental system functions that would serve as the common reference for the ecosystem.

A year has passed since the creation of the first OpenPOWER workgroups, and these workgroups have been busy setting the standards that will enable the ecosystem to grow even more.  As the Chair of the Hardware Architecture Workgroup, I am particularly delighted to share the availability of the Hardware Architecture Work Group Specification Public Review Draft for the first generation of OpenPOWER hardware architecture, and I would like to solicit your review and your feedback:

OpenPOWER ISA Profile Public Review Draft:
The purpose of the OpenPOWER Instruction Set Architecture (ISA) Profile specification is to describe the categories of the POWER ISA Version 2.07 B that are required in the OpenPOWER chip architecture for IBM POWER8 systems.
Click here to submit a comment or subscribe to the comment email distribution list.

IODA2 Specification Public Review Draft:
The purpose of the I/O Design Architecture, version 2 (IODA2) specification is to describe the chip architecture for key aspects of PCIe-based host bridge (PHB) designs for IBM POWER8 systems.
Click here to submit a comment or subscribe to the comment email distribution list.

CAIA Specification Public Review Draft:
This document defines the Coherent Accelerator Interface Architecture (CAIA) for IBM® POWER8® systems. The information contained in this document allows various CAIA-compliant accelerator implementations to meet the needs of a wide variety of systems and applications. Compatibility with the CAIA allows applications and system software to migrate from one implementation to another with minor changes.

The commenting period for all three Hardware Architecture Workgroup standards track documents closes on January 10, 2016.  I want to take this opportunity to thank the over 100 members of the workgroup for their ongoing active participation and thoughtful contributions in defining these proposed OpenPOWER specifications.

NEC’s Service Acceleration Platform for Power Systems Accelerates and Scales Cloud Data Centers

By Shinji Abe, Senior Manager for IT Platform Division of NEC

As usage of the cloud expands, cloud data centers will need to accommodate a wide range of services, from office applications to on-premises services and, in the future, the Internet of Things (IoT). To meet these needs, the modern data center requires the ability to simultaneously handle multiple demands for data storage, networking, numerical analysis, and image processing from various users.

NEC’s new Service Acceleration Platform addresses this need by working at the device level to assign resources to perform computation and scale up individual performance and functionality. Unifying standard hardware and software components, the Service Acceleration Platform delivers faster, more powerful, and more reliable computing solutions that meet customer performance demands.

What is ExpEther?

The architecture of the Service Acceleration Platform is based on NEC-developed ExpEther technology. ExpEther can extend PCI Express beyond the confines of a computer chassis via Ethernet without any modification of existing hardware or software. Computer resources can be added over a standard Ethernet fabric just as if they were added inside the chassis, providing scale-up flexibility. ExpEther can thus build a new type of computing environment free of physical constraints, and it is cost effective thanks to its use of standard Ethernet.

The CPU views the ExpEther engine as a PCI Express switch rather than as Ethernet. In other words, ExpEther is an implementation of a PCI Express switch, so it is fully compatible with the PCI Express specification.

Service Acceleration Platform

In IoT data processing, data with widely varying characteristics are generated from physical inputs. Accelerating the processing of these inputs requires different accelerator engines depending on the workload.

In NEC’s Service Acceleration Platform, all I/O devices are disaggregated by ExpEther. The platform can flexibly configure versatile systems with the required number of GPGPUs, accelerator FPGAs, and NVMe SSDs according to the workload.

CAPI Capable ExpEther Engine

NEC extended the ExpEther functionality for CAPI compliance and confirmed that ExpEther technology can extend CAPI-attached devices remotely from the CPU via an Ethernet switch.

Product Lineup

NEC is currently shipping 1G and 10G versions of its ExpEther products and is developing a high-performance version for demanding environments and workloads.


Shinji Abe is a Senior Manager for IT Platform Division of NEC Corporation in Tokyo, Japan. He is in charge of development of the Service Acceleration Platform with ExpEther technology.

Genome Folding and POWER8: Accelerating Insight and Discovery in Medical Research

By Richard Talbot, Director – Big Data, Analytics and Cloud Infrastructure

No doubt, the words “surgery” and “human genome” rarely appear in the same sentence. Yet that’s what a team of research scientists in the Texas Medical Center announced recently — a new procedure designed to modify how a human genome is arranged in the nucleus of a cell in three dimensions, with extraordinary precision. Picture folding a genome almost as easily as a piece of paper.

An artist’s interpretation of chromatin folded up inside the nucleus. The artist has rendered an extraordinarily long contour into a small area, in two dimensions, by hand. Credit: Mary Ellen Scherl.


This achievement, which appeared recently in the Proceedings of the National Academy of Sciences, was driven by a team of researchers led by Erez Lieberman Aiden, a geneticist and computer scientist with appointments at the Baylor College of Medicine and Rice University in Houston, and his students Adrian Sanborn and Suhas Rao. The news spread quickly across a broad range of major news sites. Because genome folding is thought to be associated with many life-altering diseases, the implications are profound. Erez said, “This work demonstrates that it is possible to modify how a genome is folded by altering a handful of genetic letters, without disturbing the surrounding DNA.”

Lurking just beneath the surface, this announcement represents a major computational achievement also. Erez and his team have been using IBM’s new POWER8 scale-out systems packed with NVIDIA Tesla K40 GPU accelerators to build a 3-D visualization of the human genome and model the reaction of the genome to this surgical procedure.

The total length of the human genome is over 3 billion base pairs (a typical measure of the size of a human or mammalian genome) and the data required to analyze a single person’s genome can easily exceed a terabyte: enough to fill a stack of CDs that is 40 feet tall. Thus, the computational requirement behind this achievement is a grand challenge of its own.

POWER8 memory bandwidth and the high octane computational horsepower of the NVIDIA Tesla Accelerated Computing Platform enabled the team to run applications that aren’t feasible on industry standard systems. Aiden said that the discoveries were possible, in part, because these systems enabled his team to analyze far more 3-D folding data than they could before.

This high performance cluster of IBM POWER8 systems, codenamed “PowerOmics”, was installed at Rice University in 2014 and made available to Rice faculty, students and collaborative research programs in the Texas Medical Center. The name “PowerOmics” was selected to portray the Life Sciences research mission of this high performance compute and storage resource for the study of large-scale, data-rich life sciences — such as genomics, proteomics and epigenomics. This high performance research computing infrastructure was made possible by a collaboration with OpenPOWER Foundation members Rice University, IBM, NVIDIA and Mellanox.

For more information:

Accelerating Key-value Stores (KVS) with FPGAs and OpenPOWER

By Michaela Blott, Principal Engineer, Xilinx Research

First, a bit of background: I lead a research team in the European headquarters of Xilinx, where we look into FPGA-based solutions for data centers. We experiment with the most advanced platforms and tool flows, hence our interest in OpenPOWER. If you haven’t worked with an FPGA yet, it’s a fully programmable piece of silicon that allows you to create the perfect hardware circuit for your application, thereby achieving best-in-class performance through customized data-flow architectures as well as substantial power savings. That means we can investigate how to make data center applications faster, smarter and greener while scrutinizing silicon features and tool flows. Our first application deep-dive was, and still is, key-value stores.

Key-value stores (KVS) are a fundamental part of today’s data center functionality. Facebook, Twitter, YouTube, Flickr and many others use key-value stores to implement a tier of distributed caches for their web content to alleviate access bottlenecks on relational databases that don’t scale well. Up to 30% of data center servers implement key-value stores. But data centers are hitting a wall with performance requirements that force trade-offs among high DRAM costs (for in-memory KVS), bandwidth, and latency.
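The caching-tier pattern is simple at its core: check the key-value store first, and fall back to the database only on a miss. A minimal look-aside cache sketch in Python (the backing “database” is a stand-in dict, not any real KVS API):

```python
class CacheTier:
    """memcached-style look-aside cache in front of a slow backing store."""

    def __init__(self, backing_store):
        self.cache = {}
        self.backing_store = backing_store
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1                  # fast path: served from the cache tier
            return self.cache[key]
        self.misses += 1
        value = self.backing_store[key]     # slow path: the relational-DB query
        self.cache[key] = value             # populate the cache for next time
        return value

db = {"user:1": "alice", "user:2": "bob"}   # stand-in for the database
tier = CacheTier(db)
tier.get("user:1")
tier.get("user:1")
print(tier.hits, tier.misses)   # 1 1
```

At data center scale this dict becomes a distributed fleet of memcached or Redis servers, and the hit rate of exactly this lookup path is what determines how much load reaches the database.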

We’ve been investigating KVS stores such as memcached since 2013 [1,2]. Initially the focus was on pure acceleration and power reduction. Our work demonstrated a substantial 35x performance/power advantage over the fastest x86 results published at the time. The trick was to completely transform the multithreaded software implementation into a data-flow architecture inside an FPGA, as shown below.


Figure 1: 10Gbps memcached with FPGAs
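The transformation from a multithreaded request loop into a chain of independent pipeline stages can be sketched in ordinary software; in the FPGA each stage is a hardware block, and all stages process different requests in the same clock cycle. The stage names and toy protocol below are illustrative, not the actual design:

```python
# Each generator stage mirrors one pipeline block (parse -> lookup -> respond).
# Requests stream through the stages rather than being handled by threads.

def parse(requests):
    for raw in requests:
        op, _, rest = raw.partition(" ")
        yield op, rest

def lookup(parsed, table):
    for op, rest in parsed:
        if op == "get":
            yield rest, table.get(rest, "MISS")
        else:  # "set key=value"
            key, _, value = rest.partition("=")
            table[key] = value
            yield key, "STORED"

def respond(results):
    for key, value in results:
        yield f"{key}:{value}"

table = {}
reqs = ["set a=1", "get a", "get b"]
out = list(respond(lookup(parse(reqs), table)))
assert out == ["a:STORED", "a:1", "b:MISS"]
```

In software the stages run one item at a time; the FPGA version gains its throughput because every stage operates concurrently on a different in-flight request.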

However, there were a number of limitations. First, we were not happy with the constrained amount of DRAM that can be attached to an FPGA; capacity is really important in the KVS context. Second, we were concerned about supporting more functionality: for protocols like Redis, with its 200 commands, things can get complicated. Third, we worried about ease of use, which is a typical adoption barrier for FPGAs. Finally, things become even more interesting once you add intelligence on top of your data: data analytics, object recognition, encryption, you name it. For this we really need a combination of compute resources that coherently share memory. That’s exactly why OpenPOWER presented a unique and most timely opportunity to experiment with coherent interfaces.

Benchmarking CAPI

CAPI, the Coherent Accelerator Processor Interface, enables high performance and simple programming models for attaching FPGAs to POWER8 systems. First, we benchmarked PCIe- and CAPI-attached acceleration against x86 in-memory models to determine their latency. The results are explained below:


Figure 2: System-level latency, OpenPOWER with FPGA vs. x86


PCIe DMA engines and CAPI perform significantly better than typical x86 implementations. At 1.45 microseconds, CAPI operations are so low-latency that the overall system-level impact is next to negligible. Typical x86 installations service memcached requests within a range of hundreds to thousands of microseconds; our OpenPOWER CAPI installation services the same requests in 3 to 5 microseconds, as illustrated in Figure 2 (which uses a logarithmic scale).


Figure 3: PCIe vs CAPI Bandwidth over transfer sizes


Figure 3 shows measured bandwidth versus transfer size for CAPI in comparison to a generic PCIe DMA. The numbers shown are actual measurements [4] and are representative: PCIe performance is typically very low for small transfer sizes and close to optimal for large ones. For small granular accesses, therefore, CAPI far outperforms PCIe, making it a perfect fit for the small transfer sizes required in the KVS scenario. For implementing object storage in host memory, we are really only interested in using CAPI for transfer sizes between 128 bytes and 1 KB. Smaller objects can easily be accommodated in FPGA-attached DRAM; larger objects can be accommodated in flash (see also our HotStorage 2015 publication [3]).
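The shape of this curve follows from a simple latency-amortization model: each transfer pays a fixed setup cost plus payload time, so small transfers are overhead-dominated while large transfers approach the link’s peak rate. A rough sketch of that model (all numbers are illustrative, not the measured values):

```python
def effective_bandwidth(size_bytes, setup_s, peak_bytes_per_s):
    """Achieved bandwidth for one transfer: payload over total time."""
    total_time = setup_s + size_bytes / peak_bytes_per_s
    return size_bytes / total_time

PEAK = 8e9            # ~8 GB/s peak link rate (illustrative)
PCIE_SETUP = 5e-6     # higher fixed setup cost per DMA (illustrative)
CAPI_SETUP = 0.5e-6   # much lower fixed cost (illustrative)

small, large = 256, 1 << 20   # 256 B vs 1 MiB transfers

# For small transfers the lower setup cost dominates the result...
assert effective_bandwidth(small, CAPI_SETUP, PEAK) > \
       5 * effective_bandwidth(small, PCIE_SETUP, PEAK)

# ...while for large transfers both links approach the same peak rate.
ratio = effective_bandwidth(large, CAPI_SETUP, PEAK) / \
        effective_bandwidth(large, PCIE_SETUP, PEAK)
assert 1.0 <= ratio < 1.05
```

This is why the 128 B to 1 KB range matters: it sits exactly where the fixed-cost difference between the two attachment models is most visible.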

FPGA Design

Given the promising benchmarking results, we proceeded to integrate the host memory via CAPI. For this we created a hybrid memory controller which routes and merges requests and responses between the various storage types, handles reordering, and provides a gearbox for varying access speeds and bandwidths. With these simple changes, we now have up to 1 Terabyte of coherent memory space at our disposal without loss of performance! Figure 4 shows the full implementation inside the FPGA.


Figure 4: Memcached implementation with OpenPOWER and FPGA
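The tier-selection logic of such a hybrid memory controller can be sketched as a size-based routing decision. The thresholds follow the object-size ranges discussed in this post; the tier names and function are illustrative, not the actual RTL:

```python
def route(object_size):
    """Pick a storage tier for a value, by size in bytes."""
    if object_size < 128:
        return "fpga-dram"   # small objects: low-latency on-board DRAM
    elif object_size <= 1024:
        return "host-dram"   # mid-size: coherent host memory via CAPI
    else:
        return "flash"       # large objects: high-capacity flash

assert route(64) == "fpga-dram"
assert route(512) == "host-dram"
assert route(4096) == "flash"
```

The real controller additionally merges and reorders in-flight requests, since the three tiers respond at very different speeds.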

Ease of Use

Our next biggest concern was ease of use, both for FPGA design entry and with respect to host–accelerator integration. On the latter, OpenPOWER exceeded our expectations. Using the API provided by IBM (libcxl) along with the POWER Service Layer IP (PSL) that resides within the FPGA, we completed system integration within a matter of weeks while saving huge amounts of code: 800 lines for the x86 driver, memory allocation, and pinning, to be precise, and 13.5k fewer instructions executed!

Regarding the FPGA design, it was of utmost importance to ensure that a fully functional and high-performing design could be created through a high-level design flow (C/C++ at minimum), in the first instance using Xilinx’s high-level synthesis tool, Vivado HLS. The good news is that we fully succeeded: the resulting application design was described entirely in C/C++, achieving a 64% reduction in lines of code (11,359 RTL vs. 4,069 HLS lines). The surprising bonus was that we even got a resource reduction (for FPGA-savvy readers: 22% in LUTs and 30% in FFs). And let me add, just in case you are wondering, the RTL designers were at the top of their class!

The only low-level aspects left in the design flow are the basic infrastructure IP, such as memory controllers and network interfaces, which are still manually integrated. In the future, this will be fully automated through SDAccel. In other words, a full development environment that requires no further RTL development is on the horizon.



Figure 5: Demonstration at the OpenPOWER Summit 2015

We demonstrated the first operational prototype of this design at Impact in April 2014 and then demonstrated the fully operational demo vehicle (shown in Figure 5) including fully CAPI-enabled access to host memory at the OpenPOWER Summit in March 2015. The work is now fully integrated with IBM’s SuperVessel. In the live demonstration, the OpenPOWER system outperforms an x86 implementation by 20x (see Figure 6)!


Figure 6: Screenshot of network tester showing response-traffic rates from OpenPOWER with FPGA acceleration versus an x86 software solution


The Xilinx demo architecture enables key-value stores that can operate at 60Gbps with 2TB value-store capacity that fits within a 2U OpenPOWER Server. The architecture can be easily extended. We are actively investigating using Flash to expand value storage even further for large granular access. But most of all, we are really excited about the opportunities for this architecture when combining this basic functionality with new capabilities such as encryption, compression, data analytics, and face & object recognition!

Getting Started

  • Visit Xilinx at SC15! November 15-19, Austin, TX.
  • Learn more about POWER8 CAPI
  • Purchase a CAPI developer kit from Nallatech or AlphaData
  • License this technology through Xilinx today.  We work directly with customers and data centers to scale performance/watt in existing deployments with hardware based KVS accelerators. If you are interested in this technology, please contact us.



[1] M. Blott, K. Vissers, K. Karras, L. Liu, Z. Istvan, G. Alonso: HotCloud 2013; Achieving 10Gbps Line-rate Key-value Stores with FPGAs

[2] M. Blott, K. Vissers: HotChips 2014; Dataflow Architectures for 10Gbps Line-rate Key-value Stores

[3] M. Blott, K. Vissers, L. Liu: HotStorage 2015; Scaling out to a Single-Node 80Gbps Memcached Server with 40 Terabytes of Memory

[4] PCIe bandwidth reference numbers were kindly provided by Noa Zilberman and Andrew Moore of the University of Cambridge

About Michaela Blott


Michaela Blott graduated from the University of Kaiserslautern in Germany. She has worked in both research institutions (ETH and Bell Labs) and development organizations, and was deeply involved in large-scale international collaborations such as NetFPGA-10G. Today, she works as a principal engineer at the Xilinx labs in Dublin, heading a team of international researchers investigating reconfigurable computing for data centers and other new application domains. Her expertise includes data centers, high-speed networking, emerging memory technologies and distributed computing systems, with an emphasis on building complete implementations.

American Megatrends Custom Built Server Management Platform for OpenPOWER

By Christine M. Follett, Marketing Communications Manager, American Megatrends, Inc.

As one of the newest members of the OpenPOWER Foundation, we at American Megatrends, Inc. (AMI) are very excited to get started and contribute to the mission and goals of the Foundation. Our President and CEO, Subramonian Shankar, who founded the company thirty years ago, shares his thoughts on joining the Foundation:

“Participating in OpenPOWER with partners such as IBM and TYAN will allow AMI to more rapidly engage as our market continues to grow, and will ensure our customers receive the industry’s most reliable and feature-rich platform management technologies. As a market leader for core server firmware and management technologies, we are eager to assist industry leaders in enabling next generation data centers as they rethink their approach to systems design.”

The primary technology that AMI is currently focusing on relative to its participation in the OpenPOWER Foundation is a full-featured server management solution called MegaRAC® SPX, in particular a custom version of this product developed for POWER8-based platforms. MegaRAC SPX for POWER8 is a powerful development framework for server management solutions, composed of firmware and software components based on industry standards like IPMI 2.0, SMASH, and Serial over LAN (SOL). It offers key serviceability features including remote presence, CIM profiles and advanced automation.

MegaRAC SPX for POWER8 also features a high level of modularity, with the ability to easily configure and build firmware images by selecting features through an intuitive graphical development tool chain. These features are available in independently maintained packages for superior manageability of the firmware stack. You can learn more about MegaRAC SPX at our website dedicated to AMI remote management technology here.


Foundation founding member TYAN has been an early adopter of MegaRAC SPX for POWER8, adopting it for one of their recent platforms. According to Albert Mu, Vice President of MITAC Computing Technology Corporation’s TYAN Business Unit, “AMI has been a critical partner in the development of our POWER8-based platform, the TN71-BP012, which is based on the POWER8 Architecture and provides tremendous memory capacity as well as outstanding performance that fits in datacenter, Big Data or HPC environments. We are excited that AMI has strengthened its commitment to the POWER8 ecosystem by joining the OpenPOWER Foundation.”

Founded in 1985, AMI is known worldwide for its AMIBIOS® firmware. From our start as the industry’s original independent BIOS vendor, we have evolved to become a key supplier of state-of-the-art hardware, software and utilities to top-tier manufacturers of desktop, server, mobile and embedded computing systems.

With AMI’s extensive product lines, we are uniquely positioned to provide all of the fundamental components to help OpenPOWER innovate across the system stack, providing performance, manageability, and availability for today’s modern data centers. AMI prides itself on its unique position as the only company in the industry that offers products and services based on all of these core technologies.

AMI is extremely proud to join our fellow OpenPOWER member organizations working collaboratively to build advanced server, networking, storage and acceleration technology as well as industry-leading open source software. Together we can deliver more choice, control and flexibility to developers of next-generation hyperscale and cloud data centers.

About Christine M. Follett

Christine M. Follett is Marketing Communications Manager for American Megatrends, Inc. (AMI). Together with the global sales and marketing team of AMI, which spans seven countries, she works to expand brand awareness and market share for the company’s diverse line of OEM, B2B/channel and B2C technology products, including AMI’s industry-leading Aptio® V UEFI BIOS firmware, innovative StorTrends® network storage hardware and software products, MegaRAC® remote server management tools and unique solutions based on the popular Android™ and Linux® operating systems.

Rackspace, OpenPOWER & Open Compute: Full Speed Ahead with Barreleye

By Aaron Sullivan, Senior Director and Distinguished Engineer, Rackspace

In an open community, with great partners, it’s amazing how fast things get done.

At the end of 2014, Rackspace announced its affiliation with OpenPOWER. At that time, we shared our intention to build an OpenPOWER server that cut across four major open source community initiatives (OpenStack, Open Compute, OpenPOWER, and, of course, Linux).

This past spring, at the Open Compute and OpenPOWER annual summits, Rackspace offered up our vision for a more powerful cloud, and shared our “Barreleye” server concept design. (We chose to name it after the barreleye fish because as you can see from the photo above, the fish has a transparent head. Get it? It’s open!)


Alpha release of Barreleye server package; lid removed, drive tray extended.

Since then, we’ve worked closely with our partners — Avago, IBM, Mellanox, PMC, Samsung — to make that concept a reality. The first Barreleye servers came online in July, in China. In August, we shipped engineering samples to our San Antonio lab and to our development partners.

Two weeks ago, we showed Barreleye off in its first public forum: a Rackspace-hosted Open Compute engineering workshop.


Attendees at last month’s engineering workshop check out the Barreleye, the world’s first Open Compute server with an OpenPOWER chip.


L to R, bottom and top views of “Turismo” 10-core/80 hardware thread OpenPOWER processor.

Our next batch of samples will arrive in November, with more systems going to more partners shortly thereafter. We hope to submit a draft of Barreleye’s Open Compute specification before year-end, and aim to put Barreleye in our datacenters for OpenStack services early next year. Check out some close-ups, below:


Barreleye portable “lunchbox” power supply; enables benchtop testing for those without an open rack.


Barreleye hot-swappable drive tray with 15 SSDs installed.


Alpha release of Barreleye motherboard (top) and customizable IO board (bottom).

Barreleye has the capacity for phenomenal virtual machine, container, and bare metal compute services. Further out on the horizon, we’re looking forward to Barreleye’s successor on the next generation of OpenPOWER chips, and CAPI-optimized services.

Speaking of CAPI, the OpenPOWER Foundation blog is running a series on CAPI, which enables solution architects to improve system-level performance. IBM’s Sumit Gupta writes about accelerating business applications with CAPI, while Brad Brech weighs in on the benefits of using CAPI with Flash.

It’s been an incredible journey thus far. Here are some observations we’ve made along the way:

  • Turns out bugs in open source firmware — even complicated bugs that span many elements — tend to get fixed much faster. The code and functions are not hidden, meaning everyone can get involved.
  • BIOS features. Once you’ve worked with OpenPOWER’s BIOS, you’ll want it on every server.
  • Even in its first year, OpenBMC is showing great potential. Are you in DevOps? Want more control? You’ll get it with OpenBMC. Keep an eye on it.
  • Linux distribution, device driver and adapter firmware support continue to expand. At this rate, it will not be long before mainstream server adapter products are as easy to plug into OpenPOWER as any other server.
  • People are skeptical until they see it, touch it, log into it. Once they do, they’re pretty excited by Barreleye’s very impressive specs, including:
    • The memory bandwidth — around 200 GiB/sec
    • The clock speed — 3.1 – 3.7 GHz, turbo between 3.6 – 4.1
    • The cache — more than 200 MiB
    • The CPU threads — 128 – 192, utilities like “top” and “nmon” show a CPU for every thread. Even on large displays, they run right off the edge of the screen.

When we announced our participation in OpenPOWER last year, we said, “We want our systems open, all the way down. This is a big step in that direction.”

Many big steps already taken. More big steps to go. All towards a more open future. We get there faster, together.

About Aaron Sullivan

Aaron Sullivan is a Senior Director and Distinguished Engineer at Rackspace, focused on infrastructure strategy.

Aaron joined Rackspace’s Product Development organization in late 2008, in an engineering role, focused on servers, storage, and operating systems. He moved to Rackspace’s Supply Chain/Business Operations organization in 2010, mostly focused on next generation storage and datacenters. He became a Principal Engineer during 2011 and a Director in 2012, supporting a variety of initiatives, including the development and launch of Rackspace’s first Open Compute platforms. He recently advanced to the role of Senior Director and Distinguished Engineer. These days, he spends most of his time working on next generation server technology, designing infrastructure for Rackspace’s Product and Practice Areas, and supporting the growth and capabilities of Rackspace’s Global Infrastructure Engineering team. He also frequently represents Rackspace as a public speaker, writer, and commentator.

He was involved with Open Compute since its start at Rackspace. He became formally involved in late 2012. He is Rackspace’s lead for OCP initiatives and platform designs. Aaron is serving his second term as an OCP Incubation Committee member, and sponsors the Certification & Interoperability (C&I) project workgroup. He supported the C&I workgroup as they built and submitted their first test specifications. He has also spent some time working with the OCP Foundation on licensing and other strategic initiatives.

Aaron previously spent time at GE, SBC, and AT&T. Over the last 17 years, he’s touched more technology than he cares to talk about. When he’s not working, he enjoys reading science and history, spending time with his wife and children, and a little solitude.

Interconnect Your Future with Mellanox 100Gb EDR Interconnects and CAPI

By Scot Schultz Director of HPC and Technical Computing, Mellanox

Business Challenge

Some computing jobs are so large that they must be split into pieces and solved in parallel, distributed via the network across a number of computing nodes. We find some of the world’s largest computing jobs in the realm of scientific research, where continuous advancement will require extreme-scale computing with machines that are 500-to-1000 times more capable than today’s supercomputers. As researchers constantly refine their models and push to increased resolutions, the demand for more parallel computation and advanced networking capabilities is paramount.

Computing Challenge

Efficient high-performance computing systems require high-bandwidth, low-latency connections between thousands of multi-processor nodes, as well as high-speed storage systems. As a result of the ubiquitous data explosion and the ascendance of Big Data, especially unstructured data, today’s systems need to move enormous amounts of data as well as perform more sophisticated analysis.

The network now becomes the critical element of gaining insight from today’s massive flows of data.


Only Mellanox delivers industry-standards-based solutions with advanced native hardware acceleration engines, and leveraging the latest advancements from IBM’s OpenPOWER architecture takes performance to a whole new level.

Already deployed in over 50% of the world’s most powerful supercomputing systems, Mellanox’s high-speed interconnect solutions are proven to deliver the highest scalability, efficiency, and unmatched performance for HPC systems. The latest Mellanox EDR 100Gb/s interconnect architecture includes native support for one of the newest innovations brought forth by OpenPOWER, the Coherent Accelerator Processor Interface (CAPI).

Mellanox’s 100Gb/s ConnectX®-4 architecture, with native support for CAPI, is capable of handling massively parallel communications. By delivering up to 100Gb/s of reliable, zero-loss connectivity, ConnectX-4 with CAPI provides an optimized platform for moving enormous volumes of data. With much tighter integration between the Mellanox high-performance interconnect and the processor, POWER-based systems can rip through high volumes of data and bring compute and data closer together to derive greater insights. Mellanox ConnectX-4 can be leveraged for 100Gb CAPI-attached InfiniBand, Ethernet, or storage.


CAPI Interconnects with Mellanox Data Flow

CAPI also simplifies the memory management between interconnect and CPU – which results in reduced overhead, higher performance and increased scalability. Because CAPI provides a level of integration that removes additional latency compared to platforms featuring traditional PCI-Express bus semantics, the Mellanox interconnect can move data in and out of the system with even greater efficiency.

Back to tackling the world’s toughest scientific problems: Mellanox ConnectX-4 EDR 100Gb/s “Smart” interconnect technology and IBM’s POWER architecture with CAPI can help. Oak Ridge National Laboratory and Lawrence Livermore National Laboratory, for example, have chosen solutions utilizing OpenPOWER designs developed by Mellanox, IBM, and NVIDIA for the Department of Energy’s next-generation Summit and Sierra supercomputer systems. Summit and Sierra will deliver raw computing power of more than 100 petaflops at peak performance, which will make them the most powerful computers in the world.

From innovation in nanotechnologies to climate research, medical research and the discovery of renewable energies, Mellanox and members of the OpenPOWER ecosystem are leading innovations in high performance computing.

Learn more about Mellanox 100Gb/s and CAPI

Mellanox CAPI attached interconnects are suitable for the largest deployments, but they are also accessible for more modest clusters, clouds, and commercial datacenters. Here are a few ways to get started.

Keep coming back to see blog posts from IBM and other OpenPOWER Foundation partners on how you can use CAPI to accelerate computing, networking and storage.

About Scot Schultz

Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high-speed interconnects and processor technologies. Joining the Mellanox team in March 2013 as Director of HPC and Technical Computing, Schultz is a 25-year veteran of the computing industry. Scot also maintains his role as Director of Educational Outreach and is a founding member of the HPC Advisory Council and of various other industry organizations. Follow him on Twitter: @ScotSchultz

Using CAPI and Flash for larger, faster NoSQL and analytics

By Brad Brech, Distinguished Engineer, IBM Power Systems Solutions

Business Challenge

Suppose you’re a game developer with a release coming up. If things go well, your user base could go from zero to hundreds of thousands in no time. And these gamers expect your app to capture and store their data, so the game always knows who’s playing and their progress in the game, no matter where they log in. You’re implementing an underlying database to serve these needs.

Oh—and you’ve got to do that without adding costly DRAM to existing systems, and without much of a budget to build a brand-new large shared memory or distributed multi-node database solution. Don’t forget that you can’t let your performance get bogged down with IO latency from a traditionally attached flash storage array.

More and more, companies are choosing NoSQL over traditional relational databases. NoSQL offers simple data models, scalability, and exceptionally speedy access to in-memory data. Of particular interest to companies running complex workloads is NoSQL’s high availability for key-value stores (KVS) like Redis and MemcacheDB, document stores such as MongoDB and CouchDB, and column stores like Cassandra and BigTable.

Computing Challenge

NoSQL isn’t headache-free.

Running NoSQL workloads fast enough to get actionable insights from them is expensive and complex. That requires your business either to invest heavily in a shared-memory system or to set up a multi-node networked solution that adds complexity and latency when accessing your valuable data.

Back to our game developer and their demanding gamers. As the world moves to the cloud, developers need to offer users rapid access to online content, often tagged with metadata. Metadata needs low response times as it is constantly being accessed by users. NoSQL provides flexibility for content-driven applications to not only provide fast access to data but also store diverse data sets. That makes our game developer an excellent candidate for using CAPI-attached Flash to power a NoSQL database.
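A backend like this might keep per-player state as small self-describing documents keyed by user ID, so any server handling a login can fetch the same state. A minimal sketch using a plain dict as a stand-in for a NoSQL client (the key scheme and helper names are made up for illustration; a real deployment would use a client such as redis-py or pymongo):

```python
import json

# Stand-in for a NoSQL document/KVS store, shared by all game servers.
store = {}

def save_progress(user_id, level, score):
    # The value is a self-describing JSON document, as in a document store.
    store[f"player:{user_id}"] = json.dumps({"level": level, "score": score})

def load_progress(user_id):
    doc = store.get(f"player:{user_id}")
    return json.loads(doc) if doc else None

save_progress("alice", level=7, score=1200)
state = load_progress("alice")           # same state from any login point
assert state == {"level": 7, "score": 1200}
assert load_progress("bob") is None      # new player, nothing stored yet
```

Because reads like these dominate the workload and must return in milliseconds, this is exactly the access pattern that benefits from the large, low-latency value store CAPI-attached flash provides.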

The Solution

Here’s where CAPI comes in. Because CAPI allows you to attach devices with memory coherency at incredibly low latency, you can use it to attach flash storage that functions more like extended block system memory, enabling larger, faster NoSQL. OpenPOWER Foundation technology innovators including Redis Labs, Canonical, and IBM came together to create this new deployment model and built the Data Engine for NoSQL, one of the first commercially available CAPI solutions.

CAPI-attached flash enables great things. By CAPI-attaching a 56 TB flash storage array to the POWER8 CPU via an FPGA, the application gets direct access to a large flash array with reduced I/O latency and overhead compared to standard I/O-attached flash. End-users can:

  • Create a fast path to a vast store of memory
  • Reduce latency by cutting the number of code instructions needed to retrieve data from 20,000 to as low as 2,000, by eliminating I/O overhead [1]
  • Increase performance with up to 5X more bandwidth on a per-thread basis [1]
  • Lower deployment costs by 3X through massive infrastructure consolidation [2]
  • Cut TCO through infrastructure consolidation, shrinking the number of nodes needed from 24 to 12

Get Started with Data Engine for NoSQL

Getting started is easy, and our goal is to provide you with the resources you need to begin. This living list will continue to evolve as we provide you with more guidance, information, and use cases, so keep coming back to be sure you can stay up to date.

Learn more about the Data Engine for NoSQL:

Deploy Data Engine for NoSQL:

Keep coming back to see blog posts from IBM and other OpenPOWER Foundation partners on how you can use CAPI to accelerate computing, networking and storage.


About Brad Brech

Brad Brech is a Distinguished Engineer and the CTO of POWER Solutions in the IBM Systems Division. He is currently focused on POWER and OpenPOWER solutions and is the Chief Engineer for the CAPI-attached flash solution enabler. His responsibilities include technical strategy, solution identification, and working delivery strategies with solutions teams. Brad is a member of the IBM Academy of Technology and a past Board member of The Green Grid.

[1] Based on performance analysis comparing a typical I/O-model flow (PCIe) to a CAPI-attached coherent-model flow.

[2] Based on competitive system configuration cost comparisons by IBM and Redis Labs.

Accelerating Business Applications in the Data-Driven Enterprise with CAPI

By Sumit Gupta, VP, HPC & OpenPOWER Operations at IBM
This blog is part of a series:
Pt 2: Using CAPI and Flash for larger, faster NoSQL and analytics
Pt 3: Interconnect Your Future with Mellanox 100Gb EDR Interconnects and CAPI
Pt 4: Accelerating Key-value Stores (KVS) with FPGAs and OpenPOWER

Every 48 hours, the world generates as much data as it did from the beginning of recorded history through 2003.

The monumental increase in the flow of data represents an untapped source of insight for data-driven enterprises, and drives increasing pressure on computing systems to endure and analyze it. But today, just raising processor speeds isn’t enough. The data-driven economy demands a computing model that delivers equally data-driven insights and breakthroughs at the speed the market demands.

OpenPOWER architecture includes a technology called the Coherent Accelerator Processor Interface (CAPI) that enables systems to crunch through high volumes of data by bringing compute and data closer together. CAPI is an interface that enables close integration of devices with the POWER CPU and gives them coherent access to system memory. CAPI allows system architects to deploy acceleration in novel ways for an application and to rethink traditional system designs.

CAPI-attached vs. traditional acceleration

CAPI allows attached accelerators to deeply integrate with POWER CPUs

CAPI-attached acceleration has three pillars: accelerated computing, accelerated storage, and accelerated networking. These techniques leverage accelerators like FPGAs and GPUs, storage devices like flash, and networking devices like InfiniBand adapters, all connected coherently to a POWER CPU for direct access to its system memory. Devices connected via CAPI are programmable using simple library calls that let developers modify their applications to more easily take advantage of accelerators, storage, and networking devices. The CAPI interface is available to members of the OpenPOWER Foundation and other interested developers, and enables a rich ecosystem of data center technology providers to integrate tightly with POWER CPUs to accelerate applications.

What can CAPI do?

CAPI has had an immediate effect in all kinds of industries and for all kinds of clients:

  • Healthcare: Create customized cancer treatment plans personalized to an individual’s unique genetic make-up.
  • Image and video processing: Facial expression recognition that allows retailers to analyze the facial reactions their shoppers have to their products.
  • Database acceleration and fast storage: accelerate the performance of flash storage to allow users to search databases in near real-time for a fraction of the cost.
  • Risk Analysis in Finance: Allow financial firms to monitor their risk in real-time with greater accuracy.

The CAPI Advantage

CAPI can be used to:

  • Accelerate Compute by leveraging a CAPI-attached FPGA to run, for example, Monte Carlo analysis or perform Vision Processing. The access to the IBM POWER CPU’s memory address space is a programmer’s dream.
  • Accelerate Storage by using CAPI to attach flash that can be written to as a massive memory space instead of storage—a process that slashes latency compared to traditional storage IO.
  • Accelerate Networking by deploying CAPI-attached network accelerators for faster, lower latency edge-of-network analytics.

The intelligent and close integration that CAPI enables with IBM POWER CPUs removes much of the latency associated with the I/O bus (PCIe) on other platforms. It also makes the accelerator a peer to the POWER CPU cores, allowing applications to access it natively. Consequently, a very small investment can help your system perform better than ever.

Supported by the OpenPOWER Foundation Community

We often see breakthroughs when businesses open their products to developers, inviting them to innovate. To this end IBM helped create the OpenPOWER Foundation, now with 150 members, dedicated to innovating around the POWER CPU architecture.

IBM and many of our Foundation partners are committed to developing unique, differentiated solutions leveraging CAPI. Many more general and industry-specific solutions are on the horizon. By bringing together brilliant minds from our community of innovators, the possibilities for customers utilizing CAPI technology are endless.

Get Started with CAPI

Getting started with CAPI is easy, and our goal is to provide you with the resources you need to begin. This living list will continue to evolve as we provide you with more guidance, information, and use cases, so keep coming back to be sure you can stay up to date.

  1. Learn more about CAPI:
  2. Get the developer kits:
  3. Get support for your solutions and share results with your peers on the CAPI Developer Community

Along the way reach out to us on Twitter, Facebook, and LinkedIn.

This blog is part of a series:
Pt 2: Using CAPI and Flash for larger, faster NoSQL and analytics
Pt 3: Interconnect Your Future with Mellanox 100Gb EDR Interconnects and CAPI
Pt 4: Accelerating Key-value Stores (KVS) with FPGAs and OpenPOWER

About Sumit Gupta

Sumit Gupta is Vice President, High Performance Computing (HPC) Business Line Executive and OpenPOWER Operations. With more than 20 years of experience, Sumit is a recognized industry expert in the fields of HPC and enterprise data center computing. He is responsible for business management of IBM’s HPC business and for operations of IBM’s OpenPOWER initiative.

Liquid Cooling for OpenPOWER: Asetek Accelerates the Performance of OpenPOWER Platforms

By Larry Vertal, Data Center Marketing, Asetek

Asetek® joined the OpenPOWER™ Foundation in July of this year with great enthusiasm. As the world’s leading provider of liquid cooling systems for CPUs and GPUs, with over 2 million units sold, Asetek knows it can bring a lot to OpenPOWER designs and enable the community to productize the highest-performance systems and clusters leveraging liquid cooling for OpenPOWER.

Asetek is already engaged in delivering liquid cooling designs that accelerate the performance of OpenPOWER platforms. At the 2015 International Supercomputing Conference (ISC15) in Frankfurt, Germany, in July of this year, Asetek provided the first public showing of a liquid cooling system for POWER8 processors. What is particularly interesting about this design is that it enables POWER8 server nodes to utilize the highest-performing overclocked POWER processors without concerns about throttling.

Given Asetek’s history in enabling Top500 HPC sites, the current cutting edge performance and expected enhancements to POWER processors will likely demonstrate a need for liquid cooling to provide non-throttling clusters with extreme rack densities.

Continue reading

Opening the Flood Gates: OpenPOWER Two Years Later

By Brad McCredie, President, OpenPOWER Foundation

The journey of a thousand miles begins with one step. – Lao Tzu, Chinese Philosopher

Two years ago, on August 6, 2013, IBM, along with Google, Tyan, NVIDIA and Mellanox, came together to announce the creation of OpenPOWER with the goal of building a worldwide collaborative ecosystem based on IBM’s POWER architecture. This bold move reversed the ongoing trend of data center architectures becoming increasingly closed.  IBM built a broad partnership with technology providers and clients to build an open data center platform that would allow collaboration and foster innovation. Nobody knew exactly what would happen next, but we all knew that for better or worse we would be turning the world of IT infrastructure on its head. Continue reading

Uses Ubuntu to Bring OpenPOWER-based Systems to the Public Cloud

By Randall Ross, Ubuntu

Recently, a German IT company specializing in hosting, cloud and web development services based on open source technologies announced that it is adding more power (excuse the pun) to its OpenStack public cloud service, teutoStack Public Cloud, which had previously been built exclusively on proprietary hardware. As a long-term Ubuntu Cloud Partner and Ubuntu Advantage Reseller, the company was delighted when Canonical expanded its platform to support OpenPOWER-based POWER8 systems.

By working with OpenPOWER-based technology, fueled by collaborative innovation, teutoStack Public Cloud can deliver on growing expectations in the highly competitive cloud market. It now brings new capabilities within the reach of more companies, as OpenPOWER’s price/performance advantage lowers the barrier for compute-intensive workloads such as analytics.

The combination of Ubuntu, Juju, and MAAS as key components in this new OpenPOWER-based public cloud offering is exciting, as it provides customers with real choice. They can now enjoy much higher levels of performance for analytics and other resource-hungry workloads. They can also experience the benefits of higher node density, which translates to an excellent return on infrastructure spend: a smaller server footprint, lower energy costs and a more environmentally friendly business. Best of all, they can do this without changing how they work with the cloud. OpenPOWER-based technology may be under the hood, but OpenStack is still the interface, and Juju is still the service modeler.

The combination of Ubuntu with the OpenPOWER platform has also delivered impressive reliability. The company can now easily model, provision, build, manage and support its cloud at scale. It has created the ideal platform to support its new range of cloud services, optimized for capabilities such as analytics, where it is seeing a significant boost in memory performance.

Based on the positive response from clients, the company is planning to integrate more OpenPOWER-based POWER8 servers into the teutoStack Public Cloud and eventually migrate additional OpenStack core services to POWER8 for higher performance. Customers like GRAU DATA AG, a data storage company, are already using the teutoStack Public Cloud for testing and delivering their own applications on the OpenPOWER platform with higher performance.

It is refreshing to see more and more OpenPOWER solutions coming to market every day, and all the hard work of the OpenPOWER Foundation members, including Canonical, is paying off for these companies and their customers. Ubuntu has always been focused on giving people choice and access to the best technology. Now, with OpenPOWER, we have a new and exciting way to do that.


About Randall Ross

Randall Ross is an Ubuntu Community Manager with Canonical. He is passionate about all things POWER and works to grow the community building high-impact Ubuntu and OpenPOWER-based solutions. Randall leads the OpenPOWER Foundation’s Integrated Solutions workgroup. Prior to joining Canonical, Randall enjoyed over 20 years working in various IT management and consulting roles, ensuring that technology solutions match business needs. He also built and manages one of the largest Ubuntu face-to-face communities in his home city of Vancouver, Canada.

Imperial College London and IBM Join Forces to Accelerate Personalized Medicine Research within the OpenPOWER Ecosystem

By Dr. Jane Yu, Solution Architect for Healthcare & Life Sciences, IBM

When the Human Genome Project was completed in 2003, it was a watershed moment for the healthcare and life science industry. It marked the beginning of a new era of personalized medicine where the treatment of disease could be tailored to the unique genetic code of individual patients.

We’re closer than ever to fully tailored treatment. To accelerate advances in personalized medicine research, IBM Systems is partnering with the Data Science Institute of Imperial College London (DSI) and its leading team of bioinformatics and data analysis experts. At the heart of this collaboration is tranSMART, an open-source data warehouse and knowledge management system that has already been adopted by commercial and academic research organizations worldwide as a preferred platform for integrating, accessing, analyzing, and sharing clinical and genomic data on very large patient populations. DSI and IBM Systems will be partnering to enhance the performance of the tranSMART environment within the OpenPOWER ecosystem by taking advantage of the speed and scalability of IBM POWER8 server technology, IBM Spectrum Scale storage, and IBM Platform workload management software.

At ISC 2015 in Frankfurt, representatives from Imperial College DSI and IBM Systems will be demonstrating an early prototype of a personalized medicine research environment in which tranSMART is directly linked to IBM text analytics for mining curated scientific literature on POWER8. For a demonstration, please visit us at IBM booth #928 at ISC.

How did we get here? In recent years, the advent of Next Generation Sequencing (NGS) technologies has significantly reduced the cost and time required to sequence whole human genomes: It took roughly $3B USD to sequence the first human genome across 13 years of laboratory bench work; today, a single human genome can be sequenced for roughly $1,000 USD in less than a day.
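The magnitude of that drop is worth spelling out; a quick back-of-the-envelope check using the round numbers quoted above:

```python
# Round numbers from the paragraph above.
first_genome_cost_usd = 3_000_000_000   # ~$3B for the first human genome
cost_today_usd = 1_000                  # ~$1,000 per genome today

first_genome_days = 13 * 365            # ~13 years of bench work
days_today = 1                          # under a day today

print(f"cost reduction: {first_genome_cost_usd // cost_today_usd:,}x")
print(f"time reduction: ~{first_genome_days // days_today:,}x")
# cost reduction: 3,000,000x
# time reduction: ~4,745x
```

A roughly three-million-fold cost reduction in just over a decade is what makes population-scale genomics, and the data deluge described below, possible at all.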

The task of discovering new medicines and related diagnostics based on genomic information requires a clear understanding of the impact that individual sequence variations have on clinical outcomes. Such associations must be analyzed in the context of prior medical histories and other environmental factors. But this is a computationally daunting task: deriving such insights requires scientists to access, process, and analyze genomic sequences, longitudinal patient medical records, biomedical images, and other complex, information-rich data sources securely within a single compute and storage environment. Scientists may also want to leverage the corpus of peer-reviewed scientific literature that may already exist about the genes and molecular pathways influencing the disease under study. Computational workloads must be performed across thousands of very large files containing heterogeneous data, where just a single file containing genomic sequence data alone can be on the order of hundreds of megabytes. Moreover, biological and clinical information critical to the study must be mined from natural language, medical images, and other non-traditional unstructured data types at very large scale.

As drug development efforts continue to shift to increasingly complex and/or exceedingly rare disease targets, the cost of bringing a drug to market is projected to top $2.5B USD in 2015, up from about $1B USD in 2001. The ability of government, commercial, and academic research organizations to innovate in personalized medicine requires that the compute-intensive workloads essential to these efforts run reliably and efficiently. IBM Systems has the tools to deliver.

The high-performance compute and storage architecture must have the flexibility to address the application needs of individual researchers, the speed and scale to process rapidly expanding stores of multimodal data within competitive time windows, and the smarts to extract facts from even the most complex unstructured information sources. The financial viability of these initiatives depends on it. The tranSMART environment addresses each of these critical areas.

Code which demonstrates marked improvements in the performance and scalability of tranSMART on POWER systems will be donated back to the tranSMART open-source community. Early performance gains have already been seen on POWER8. In addition, IBM Systems will be working with DSI, IBM Watson, and other IBM divisions to enable large-scale text analytics, natural language processing, machine learning, and data federation capabilities within the tranSMART – POWER analytical environment.

We look forward to seeing you at ISC to show you how OpenPOWER’s HPC capabilities are helping to improve personalized medicine and healthcare.

About Dr. Jane Yu

Jane Yu, MD, PhD is a Worldwide Solution Architect for Healthcare & Life Science within IBM Systems. Dr. Yu has more than 20 years of experience spanning clinical healthcare, biomedical research, and advanced analytics. Since joining IBM in 2011, Dr. Yu has been building on-premise and cloud-based data management and analytics systems that enable leading edge clinical and basic science research. She holds an MD and a PhD in Biomedical Engineering from Johns Hopkins University School of Medicine, and a Bachelor of Science in Aeronautics & Astronautics from the Massachusetts Institute of Technology.


Joseph A. DiMasi, slides: “Innovation in the Pharmaceutical Industry: New Estimates of R&D Costs,” Tufts Center for the Study of Drug Development, November 18, 2014.

PGI OpenPOWER+Tesla Compilers Demo’ing at ISC’15

By Patrick Brooks, PGI Product Marketing Manager

Last November at Supercomputing 2014, we announced that the PGI compilers for high-performance computing were coming to the OpenPOWER platform. These compilers will be used on the U.S. Department of Energy CORAL systems being built by IBM, and will be generally available on OpenPOWER systems in 2016. PGI compilers on OpenPOWER offer a user interface, language features, programming models and optimizations identical to PGI on Linux/x86. Any HPC application you are currently running on x86+Tesla using PGI compilers will re-compile and run with few or no modifications on OpenPOWER+Tesla, making your applications portable to any HPC systems in the data center based on OpenPOWER or x86 CPUs, with or without attached NVIDIA GPU compute accelerators. PGI presented this in detail at the OpenPOWER Foundation Summit in March.

At ISC’15 in Frankfurt, Germany, July 14-17, you can get a first peek at the PGI compilers for OpenPOWER at the PGI stand (#1051) in the ISC exhibition hall. An early version of the compilers is already in use at Oak Ridge National Laboratory (ORNL), one of the two DOE sites where the IBM-developed CORAL supercomputers will be installed. For the ISC demo, the PGI Accelerator C/C++ compilers are being shown running on a remote IBM S824L OpenPOWER Linux server with an attached NVIDIA Tesla K40 GPU. These are pre-production PGI compilers, but all GCC 4.9 compatibility features, OpenACC 2.0 features and interoperability with CUDA Unified Memory are enabled. The system is running Ubuntu Linux and NVIDIA CUDA 7.0.

These compilers are being developed for future generation, closely coupled IBM OpenPOWER CPUs and NVIDIA GPUs. To demonstrate the capabilities they already have, PGI is showing how its pgc++ compiler for OpenPOWER can build an OpenACC version of the well-known Lulesh Hydrodynamics Proxy application (mini-app). Lulesh was developed at the Lawrence Livermore National Laboratory (LLNL), which is the other site where the IBM-developed CORAL supercomputers will be installed.

Like most mini-apps, Lulesh is a relatively small code of only a few thousand lines, so it can be built and executed fairly quickly. Within those few thousand lines of code, 45 OpenACC pragmas are sprinkled in to enable parallel execution. Any C++ compiler that doesn’t implement OpenACC extensions ignores the pragmas, but with an OpenACC-enabled compiler like the one from PGI, they enable parallelization and offloading of compute-intensive loops for execution on the NVIDIA K40 GPU. Here’s what a section of the Lulesh code with OpenACC pragmas looks like:

3267    Real_t *pHalfStep = Allocate<Real_t>(length) ;
3269    #pragma acc parallel loop
3270    for (Index_t i = 0 ; i < length ; ++i) {
3271       e_new[i] = e_old[i] - Real_t(0.5) * delvc[i] * (p_old[i] + q_old[i])
3272                + Real_t(0.5) * work[i];
3274       if (e_new[i] < emin ) {
3275          e_new[i] = emin ;
3276       }
3277    }
3279    CalcPressureForElems(pHalfStep, bvc, pbvc, e_new, compHalfStep, vnewc,
3280                         pmin, p_cut, eosvmax, length, regElemList);
3282    #pragma acc parallel loop
3283    for (Index_t i = 0 ; i < length ; ++i) {
3284       Real_t vhalf = Real_t(1.) / (Real_t(1.) + compHalfStep[i]) ;
3286       if ( delvc[i] > Real_t(0.) ) {
3287          q_new[i] /* = qq_old[i] = ql_old[i] */ = Real_t(0.) ;
3288       }
3289       else {
3290          Real_t ssc = ( pbvc[i] * e_new[i]
3291                       + vhalf * vhalf * bvc[i] * pHalfStep[i] ) / rho0 ;
3293          if ( ssc <= Real_t(.1111111e-36) ) {
3294             ssc = Real_t(.3333333e-18) ;
3295          } else {
3296             ssc = SQRT(ssc) ;
3297          }
3299          q_new[i] = (ssc*ql_old[i] + qq_old[i]) ;
3300       }
3302       e_new[i] = e_new[i] + Real_t(0.5) * delvc[i]
3303                * ( Real_t(3.0)*(p_old[i] + q_old[i])
3304                  - Real_t(4.0)*(pHalfStep[i] + q_new[i])) ;
3305    }

The PGI compilers have a nice feature where they report back to the user whether and how loops are parallelized, and give advice on how source code might be modified to enable more or better parallelization or optimization. When the above loops are compiled, the corresponding messages emitted by the compiler look as follows:

3267, Accelerator kernel generated
      3270, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
3267, Generating copyout(e_new[:length])
      Generating copyin(delvc[:length],p_old[:length],q_old[:length],e_old[:length],work[:length])
      Generating Tesla code
3279, Accelerator kernel generated
      3283, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
3279, Generating copyin(p_old[:length],q_old[:length],pHalfStep[:length],bvc[:length])
      Generating copy(e_new[:length])
      Generating copyin(pbvc[:length],qq_old[:length],ql_old[:length],delvc[:length],compHalfStep[:length])
      Generating copy(q_new[:length])
      Generating Tesla code
The compiler generates an executable that includes both OpenPOWER CPU code and GPU-optimized code for any loops marked with OpenACC pragmas. It is a single executable usable on any Linux/OpenPOWER system, one that will offload loops for acceleration on any such system that incorporates NVIDIA GPUs. You can see in the messages that the PGI compiler is generating copyin/copyout calls to a runtime library that moves data back and forth between CPU memory and GPU memory. However, in this case the code is compiled to take advantage of CUDA Unified Memory, so when the executable is run those calls are ignored and the system moves data back and forth automatically. When the lulesh executable is run on the IBM S824L system, the output looks as follows:

tuleta1% make run150
./lulesh2.0 -s 150 -i 100
Running problem size 150^3 per domain until completion
Num processors: 1
Total number of elements: 3375000
To run other sizes, use -s <integer>.
To run a fixed number of iterations, use -i <integer>.
To run a more or less balanced region set, use -b <integer>.
To change the relative costs of regions, use -c <integer>.
To print out progress, use -p
To write an output file for VisIt, use -v
See help (-h) for more options
Run completed:
   Problem size        = 150
   MPI tasks           = 1
   Iteration count     = 100
   Final Origin Energy = 1.653340e+08
   Testing Plane 0 of Energy Array on rank 0:
      MaxAbsDiff   = 1.583248e-08
      TotalAbsDiff = 7.488936e-08
      MaxRelDiff   = 8.368586e-14

If you’re at ISC this week, stop by to see the demo live and give us your feedback. We’re working to add full support for Fortran and all of our remaining programming model features and optimizations, and plan to show another demo of these compilers at the conference this coming November in Austin, Texas. Soon thereafter in 2016, more and more HPC programmers will be able to port their existing PGI-compiled x86 and x86+Tesla applications to OpenPOWER+Tesla systems quickly and easily, with all the same PGI features and user interface across both platforms.

We’ll keep you posted on our progress.

About Pat Brooks

Patrick Brooks has been a Product Marketing Manager at PGI since 2004. Previously, he held several technology marketing positions, including product and marketing management at Intel and Micron, account management at Regis McKenna, and independent consulting.

SuperVessel: Come Create the Era of Heterogeneous Computing with FPGAs for the Cloud

By Michael Leventhal, Technical Manager, Data Center Acceleration, Xilinx

Xilinx believes that we, in collaboration with the OpenPOWER Foundation, are spearheading a new era of computing, one that is capable of expanding human potential. In a word, it is the cloud — which delivers compute capacity and data intelligence to the fingertips of billions. Big data and raw compute capacity have enabled the creation of computing solutions that are categorically different from anything done before.

These developments have paved the way for systems that can understand speech, translate languages, recognize individuals, interpret actions in video streams and even autonomously drive cars. However, compute infrastructure based on the 70-year-old Von Neumann architecture cannot, on its own, handle this work within reasonable limits of cost, space, power, and system complexity. In a very short time, this venerable architecture will be unable to meet the taxing demands of the cloud.

Soon, heterogeneous computing will be the new paradigm equipped to handle these complex workloads. Heterogeneous computing combines CPUs and compute engines with innovative architectures, which will be considerably more efficient for new era cloud workloads. Now, thanks to the IBM POWER Coherent Accelerator Processor Interface (CAPI), Xilinx FPGAs are dynamically hardware-configured to efficiently run cloud applications with an IBM POWER processor and share coherent access to host memory between the processor and the FPGA.

The Von Neumann architecture has been refined over decades and the compute applications that run on it are highly optimized to run efficiently. Heterogeneous computing has been developing rapidly over the last decade, but there is still a great amount of research and development needed before reaching its full potential. This is one of the most critical areas of computing research today.

Xilinx is committed to supporting this research and development to help redefine the future of heterogeneous computing. That’s why we joined forces with IBM to design POWER processors with FPGAs attached and enable researchers, students, and developers in the community to leverage OpenPOWER and Xilinx development tools through SuperVessel. This open access cloud service, which was created by IBM Research Beijing and IBM Systems Labs, is now provisioned with CAPI-compatible FPGAs, providing a complete virtual R&D engine for the creation and testing of cloud applications in areas such as deep analytics, machine learning and IoT.

To further educate developers, Xilinx collaborated with several universities to organize the first international workshop on High Performance Heterogeneous Reconfigurable Computing (H²RC) at SC15. This will mark the first time an FPGA-focused workshop aimed at the heterogeneous computing community has been held at the supercomputing conference. We’d like to invite you to take advantage of the resources available through SuperVessel and share your experience with the community at H²RC.

See you in the Cloud and see you at SC15!

About Michael Leventhal, Technical Manager, Data Center Acceleration, Xilinx

Michael is responsible for leadership in the compute acceleration sector of Xilinx’s data center business unit. He has more than a decade of experience in co-processing engines and acceleration with reconfigurable logic, software, and design tools in a wide range of application domains as an inventor, technologist, product manager, and marketer.  He holds a BS-EECS degree from U.C. Berkeley.

Mellanox and the OpenPOWER Ecosystem to Help Generate Economic Growth

By Scot Schultz, Mellanox

In the latest announcement (UK Government Invests £115 Million in Big Data and Cognitive Computing Research with STFC and IBM), the STFC Hartree Centre is setting out to enable the latest in world-class, state-of-the-art technologies for the development of advanced software solutions that solve real-world challenges in academia, industry and government and tackle the ever-growing issues of big data.

The architecture will include POWER CPUs from IBM, the latest in flash-memory storage, GPUs from NVIDIA and, of course, the most advanced networking technology from Mellanox.  Enhanced with native support for CAPI technology and network-offload acceleration capabilities, the Mellanox interconnect will rapidly shuttle data around the system in the most effective and efficient manner, keeping the cores focused on crunching the data, not on processing network communications.

Since the inception of the OpenPOWER Foundation, Mellanox has been an active Platinum member with shared goals to collaborate with technology leaders and end users around the world to develop hardware and software solutions that are far superior in tackling the ever changing complexities of today’s problem-sets.

Hartree Centre, a well-established source of innovation with leading computational scientists, data scientists and software developers, will now have the leading-edge capabilities to help them produce better outcomes to the challenges they tackle every day. For example, the Hartree Centre is already helping businesses like Unilever and GlaxoSmithKline use high performance computing to improve the stability of home products such as fabric softeners and to pinpoint links between genes and diseases.

We are excited for this latest collaboration and look forward to the great work that is to come from Hartree Centre as well as the OpenPOWER Foundation.

The Open Secret Behind the Success of OpenPOWER

By Brad McCredie, President

The release this week of Intel’s new 18-core Haswell-EX chip gives us an opportunity to gauge how OpenPOWER technology and our “co-opetition” business model are stacking up. It’s an open secret. In just a little over a year, with more than 10 new collaboratively built hardware solutions and growing, the OpenPOWER Foundation is reimagining the data center and leading our industry into a new era, dominated by hyperscale clouds and analytics on huge datasets.

When we were founded in 2013, some in the industry were skeptical of our approach – even comparing us to OpenSPARC, a technology that is generally acknowledged to have underperformed. But where OpenSPARC gave away old technology and never really focused on building a strong ecosystem, OpenPOWER shares new technology, has an industry-led ecosystem of more than 100 members, and is built around the first system developed for the most modern workloads and deployment models. It should also be noted that while OpenSPARC and Intel’s proprietary product line were among the many options back when Moore’s Law appeared to be perpetually sustainable, OpenPOWER is emerging as Moore’s Law approaches its limit, and the industry is eager for alternative choices. As a study by a leading industry analyst put it last year, POWER8, the IBM architecture that is the basic building block of OpenPOWER, “offers a viable alternative to Intel’s market-leading products…and is energizing the OpenPOWER Foundation.” Those who make the point that the OpenPOWER approach has been tried before are right. But while OpenPOWER has dared to go where others have attempted to go before, it is the first model to get it right. In short, OpenPOWER is the wave of the future, and there’s no turning back. The industry is voting with its feet and its dollars.

Power Systems lead the global Big Data and analytics market and are the top choice for scalable systems. Nine of the top 10 banks and eight of the top 10 retailers worldwide run Power Systems.

OpenPOWER’s success is not due solely to its innovative business model. We have been able to marry business model innovation with technology innovation to deliver choice, freedom and superior performance demanded by clients around the world. So, how do our specs stack up?

By any measure, POWER8 processors offer more memory, more threads, more bandwidth and more cache than comparable Intel processors. Built for Big Data, Power Systems offer virtualization without limits and security without doubt. They are optimized to run core, mission-critical applications alongside emerging business applications, and they offer efficient, cost-effective and simple-to-manage clouds.

Finally, as an independent analyst recently noted, “pricing is no contest,” with Power chips averaging about half the cost of Intel chips.

While the single-company-led, closed, proprietary microprocessor model is fighting to maintain its foothold in the industry, it is no longer the only game in town. OpenPOWER is a bold, unprecedented move that is industry-led, community-driven and gaining momentum. Congratulations to Intel for the introduction of its new chip, but as baseball great Satchel Paige once famously declared, “Don’t look back, something might be gaining on you.”

A POWERFUL Birthday Gift to Moore’s Law

By Bradley McCredie

President, OpenPOWER Foundation

As we prepare to join the computing world in celebrating the 50th anniversary of Moore’s Law, we can’t help but notice how the aging process has slowed it down. In fact, in a recent interview with IEEE Spectrum, Moore said, “I guess I see Moore’s Law dying here in the next decade or so.”  But we have not come to bury Moore’s Law.  Quite the contrary: we need the economic advancements derived from the scaling Moore’s Law describes to survive, and survive they will, if the model adapts yet again to changing times.

It is clear, as the next generation of warehouse-scale computing comes of age, that sole reliance on the “tick-tock” approach to microprocessor development is no longer viable.  As I told the participants at our first OpenPOWER Foundation Summit last month in San Jose, the era of relying solely on the generation-to-generation improvements of the general-purpose processor is over.  The advancement of the general-purpose processor is being outpaced by the disruptive and surging demands being placed on today’s infrastructure.  At the same time, the need for the cost/performance advancement and computational growth rates that Moore’s Law used to deliver has never been greater.   OpenPOWER is a way to bridge that gap and keep Moore’s Law alive through customized processors, systems, accelerators, and software solutions.  At our San Jose Summit, some of our more than 100 Foundation members, spanning 22 countries and six continents, unveiled the first of what we know will be a growing number of OpenPOWER solutions, developed collaboratively and built upon the non-proprietary IBM POWER architecture. These solutions include:

  • Prototype of IBM’s first OpenPOWER high performance computing server on the path to exascale
  • First commercially available OpenPOWER server, the TYAN TN71-BP012
  • First GPU-accelerated OpenPOWER developer platform, the Cirrascale RM4950
  • Rackspace open server specification and motherboard mock-up combining OpenPOWER, Open Compute and OpenStack

Together, we are reimagining the data center, and our open innovation business model is leading historic transformation in our industry.

The OpenPOWER business model is built upon a foundation of a large ecosystem that drives innovations and shares the profits from those innovations. We are at a point in time where business model innovation is just as important to our industry as technology innovation.

You don’t have to look any further than OpenPOWER Chairman Gordon MacKean’s company, Google, to see an example of what I mean. While the technology that Google creates and uses is leading in our industry, Google would not even be a shadow of the company it is today without its extremely innovative business model. Google gives away all of its advanced technology for free and monetizes it through other means.

In fact, if you think about it, almost all of the fastest-growing “new companies” in our industry are built on innovative technology ideas, but the most successful ones are all leveraging business model innovations as well.

The early successes of the OpenPOWER approach confirm what we all know: to expedite innovation, we must move beyond a processor- and technology-only design ecosystem to one that takes into account system bottlenecks, system software, and, most importantly, the benefits of an open, collaborative ecosystem.

This is about how organizations, companies and even countries can address disruptions and technology shifts to create a fundamentally new competitive approach.

No one company alone can spark the magnitude or diversity of the type of innovation we are going to need for the growing number of hyper-scale data centers. In short, we must collaborate not only to survive…we must collaborate to innovate, differentiate and thrive.

The OpenPOWER Foundation, our global team of rivals, is modeling what we IBMers like to call “co-opetition” – competing when it is in the best interest of our companies and cooperating with each other when it helps us all.  This combination of breakthrough technologies and unprecedented collaboration is putting us in the forefront of the next great wave of computing innovation.  Which takes us back to Moore’s Law.  In 1965, when Gordon Moore gave us a challenge and a roadmap to the future, there were no smartphones or laptops, and wide-scale enterprise computing was still a dream.  None of those technology breakthroughs would have been possible without the vision of one man who shared it with the world.  OpenPOWER is a bridge we share to a new era. Who knows what breakthroughs it will spawn in our increasingly technology-driven and connected world.  As Moore’s Law has shown us, the future is wide open.

Center of Accelerated Application Readiness: Preparing applications for Summit


The hybrid CPU-GPU architecture is one of the main tracks for dealing with the power limitations imposed on high performance computing systems. It is expected that large leadership computing facilities will, for the foreseeable future, deploy systems with this design to address science and engineering challenges for government, academia, and industry. Consistent with this trend, the U.S. Department of Energy’s (DOE) Oak Ridge Leadership Computing Facility (OLCF) has signed a contract with IBM to bring a next-generation supercomputer to the Oak Ridge National Laboratory (ORNL) in 2017. This new supercomputer, named Summit, will provide at least five times the performance of Titan, the OLCF’s current hybrid CPU+GPU leadership system, on science applications, and become the next peak in leadership-class computing systems for open science. In response to a call for proposals, the OLCF has selected science and engineering application development teams and will partner with them to port and optimize their applications and carry out a science campaign at scale on Summit.

Speaker Organization

National Center for Computational Sciences
Oak Ridge National Laboratory
Oak Ridge, TN, USA


Download Presentation

Back to Summit Details

Get Ready to Rethink the Data Center: Welcome to OpenPOWER Summit 2015!

By Gordon MacKean, OpenPOWER Chairman

We are here, we made it: the OpenPOWER Foundation’s inaugural Summit, “Rethink the Data Center,” starts tomorrow! I wanted to take this opportunity to welcome everyone joining us for the OpenPOWER Summit at NVIDIA’s GPU Technology Conference in San Jose. We’ve got an exciting few days planned for you. Our three-day event kicks off tomorrow morning and runs through Thursday afternoon. The full schedule is available online, but here’s a quick rundown of what you won’t want to miss:

  • All Show – Demos and Presentations in the OpenPOWER Pavilion on the GTC exhibit floor!  Join fellow OpenPOWER members to hear firsthand about their OpenPOWER-based solutions.
  • Wednesday – Morning Keynotes with myself and OpenPOWER President Brad McCredie where we’ll unveil just how quickly our hardware ecosystem is expanding.
  • Wednesday – Afternoon Member Presentations with several of our Foundation members.   You’ll hear from members such as Mellanox, Tyan, Altera, Rackspace, and Suzhou PowerCore about how they’re dialing up the volume on innovation.
  • Thursday – Hands-on Firmware Training Labs hosted by IBM for building, modifying and testing OpenPOWER firmware with expert guides.
  • Thursday – ISV Roundtable where we’ll discuss OpenPOWER at the software level, including presentations, lightning talks, open discussion and facilitated brainstorming.


With so much great content and a high level of engagement from our members, the OpenPOWER Summit is clearly the place for other interested parties to get involved and learn how they can join a global roster of innovators rethinking the data center.  Our work is far from over and, given our rapid membership growth, there is no slowdown in sight.  As of today, the Foundation comprises over 110 members across 22 countries.  Through eight chartered Work Groups, with more on the way, we are providing the technical building blocks needed to equip our community to drive meaningful innovation.

So to those who are attending, welcome! I look forward to seeing you here in San Jose. And, for those who are unable to join us this week, I invite you to follow the conversation online with #OpenPOWERSummit – there will undoubtedly be lots to talk about this week!

Yours in collaboration,


Innov8 with POWER8 “Best in Show” Selected at InterConnect2015

By Terri Virnig, Vice President of Power Ecosystem and Strategy, IBM Systems

It is hard to believe that this semester’s Innov8 with POWER8 Challenge has come to a close. Over the course of the fall semester, we worked with three top universities – North Carolina State University, Oregon State University and Rice University – to provide computer science seniors and graduate students with Power Systems and the associated stack environment needed to work on two projects each. The students set out to tackle real-world business challenges with each of their projects, pushing the limits of what’s possible and gaining market-ready career skills and knowledge. To further support the collaboration, each of the universities was given an opportunity to work with industry leaders from the OpenPOWER Foundation, including Mellanox, NVIDIA and Altera.

This week at IBM InterConnect2015 in Las Vegas, students and their professors from each of the participating universities had the opportunity to showcase their projects at our InterConnect Solution Expo. The conference attracted more than 20,000 attendees and gave the students the opportunity to share the countless hours of research and hard work each of them has put into their projects.

Innov8 with POWER8

For those who have been following along over the past few months, you know that we’ve given our community of followers the opportunity to vote for their favorite project with our ‘Tweet your vote’ social voting platform. Before I share the winner of our “Best in Show” recognition, here’s a quick rundown of how the universities have taken advantage of the Power platform to work on truly innovative projects:

North Carolina State University (NCSU)

  • NCSU’s projects addressed real-world bottlenecks in deploying big data solutions. NCSU built up a strong set of skills in big data, working with the Power team to push the boundaries in delivering what clients need, and these projects extended their work to the next level, by taking advantage of the accelerators that are a core element of the POWER8 value proposition.
  • Project #1 focused on big data optimization, accelerating the preprocessing phase of their big data pipeline with power-optimized, coherently attached reconfigurable accelerators in FPGAs from Altera. The team assessed the work from the IBM Zurich Research Laboratory on text analytics acceleration, aiming to eventually develop their own accelerators.
  • Project #2 focused on smart storage. The team worked on leveraging the Zurich accelerator in the storage context as well.

Oregon State University (OSU)

  • OSU’s Open Source Lab has been a leader in open source cloud solutions on Power Systems, even providing a cloud solution hosting more than 160 projects. With their projects, OSU aimed to create strong Infrastructure as a Service (IaaS) offerings, leveraging the network strengths of Mellanox, as well as improving the management of the cloud solutions via a partnership with Chef.
  • Project #1 focused on cloud enablement, working to create an OpenPOWER stack environment to demonstrate Mellanox networking and cloud capabilities.
  • On the other end, for project #2, OSU took an open technology approach to cloud, using Linux, OpenStack and KVM to create a platform environment managed by Chef in the university’s Open Source Lab.

Rice University

  • Rice University has recognized that genomics information consumes massive datasets and that developing the infrastructure required to rapidly ingest, perform analytics and store this information is a challenge. Rice’s initiatives, in collaboration with NVIDIA and Mellanox, were designed to accelerate the adoption of these new big data and analytics technologies in medical research and clinical practice.
  • Project #1 focused on exploiting the massive parallelism of GPU accelerator technology and linear programming algorithms to provide deeper understanding of basic organism biology, genetic variation and pathology.
  • For project #2, students developed new approaches to high-throughput, systematic identification of chromatin loops between genomic regulatory elements, utilizing GPUs to search the space of possible chromatin interactions for true chromatin loops efficiently and in parallel.

We are especially proud of the work that each and every one of the students has put into the Innov8 with POWER8 Challenge. As a result of social voting across our communities, it is our pleasure to announce that our 2015 Best in Show recognition goes to project “Genome Assembly in a Week” from Rice University! The team leader, Prof. Erez Aiden, and students Sarah Nyquist and Chris Lui were on hand at InterConnect2015 to receive their recognition at the Infrastructure Matters zone in the Solution Expo at the Mandalay Bay Convention Center on Wednesday.


Being able to experience the innovative thinking and enthusiasm from our university participants has been such a privilege. Throughout the semester, each of the universities truly made invaluable contributions in the IT space. Thank you to all who voted and stopped by during the conference! We invite you to stay tuned for more updates on these projects at Edge2015. You can follow the teams on our Tumblr page.


Crossing the Performance Chasm with OpenPOWER

Executive Summary

The increasing use of smart phones, sensors and social media is a reality across many industries today. It is not just where and how business is conducted that is changing, but the speed and scope of the business decision-making process is also transforming because of several emerging technologies – Cloud, High Performance Computing (HPC), Analytics, Social and Mobile (CHASM). 

High Performance Data Analytics (HPDA) is the fastest growing segment within HPC. Businesses are investing in HPDA to improve customer experience and loyalty, discover new revenue opportunities, detect fraud and breaches, optimize oil and gas exploration and production, improve patient outcomes, mitigate financial risks, and more. Likewise, HPDA helps governments respond faster to emergencies, analyze terrorist threats better and more accurately predict the weather – all of which are vital for national security, public safety and the environment. The economic and social value of HPDA is immense.

But the sheer volume, velocity and variety of data is an obstacle to crossing the Performance Chasm in almost every industry. To meet this challenge, organizations must deploy a cost-effective, high-performance, reliable and agile IT infrastructure to deliver the best possible business outcomes. This is the goal of IBM’s data-centric design of Power Systems and the OpenPOWER Foundation.

A key underlying belief driving the OpenPOWER Foundation is that focusing solely on microprocessors is insufficient to help organizations cross this Performance Chasm. System stack (processors, memory, storage, networking, file systems, systems management, application development environments, accelerators, workload optimization, etc.) innovations are required to improve performance and cost/performance. IBM’s data-centric design minimizes data motion, enables compute capabilities across the system stack, provides a modular, scalable architecture and is optimized for HPDA.     

Real world examples of innovations and performance enhancements resulting from IBM’s data-centric design of Power Systems and the OpenPOWER Foundation are discussed. These span financial services, life sciences, oil and gas and other HPDA workloads. These examples highlight the urgent need for clients (and the industry) to evaluate HPC systems performance at the solution/workflow level rather than just on narrow synthetic point benchmarks such as LINPACK that have long dominated the industry’s discussion.    

Clients who invest in IBM Power Systems for HPC could lower their total cost of ownership (TCO) with fewer, more reliable servers compared to x86 alternatives.  More importantly, these customers will also be able to cross the Performance Chasm by leveraging high-value offerings delivered by the OpenPOWER Foundation for many real-life HPC workloads.


Sponsored by IBM
Srini Chari, Ph.D., MBA


Download Presentation

Back to Summit Details

DB2 BLU w/GPU Demo – Concurrent execution of an analytical workload on a POWER8 server with K40 GPUs


In this technology preview demonstration, we will show the concurrent execution of an analytical workload on a POWER8 server with K40 GPUs. DB2 will detect both the presence of GPU cards in the server and opportunities in queries to shift the processing of certain core operations to the GPU.  The required data will be copied into GPU memory, the operation performed, and the results sent back to the POWER8 processor for any remaining processing. The objectives are to 1) reduce the elapsed time for the operation and 2) make more CPU capacity available to other SQL processing, increasing overall system throughput by moving CPU-intensive tasks to the GPU.

Speaker names / Titles

Sina Meraji, PhD
Hardware Acceleration Laboratory, SWG

Berni Schiefer
Technical Executive (aka DE),
Information Management Performance and Benchmarks
DB2, BigInsights / Big SQL, BlueMix SQLDB / Analytics Warehouse  and Optim Data Studio 


Download Presentation

Back to Summit Details

Deploying POWER8 Virtual Machines in OVH Public Cloud

By Carol B. Hernandez, Sr. Technical Staff Member, Power Systems Design

Deploying POWER8 virtual machines for your projects is straightforward and fast using OVH POWER8 cloud services. POWER8 virtual machines are available in two flavors in OVH’s RunAbove cloud:



POWER8 compute is offered in RunAbove as a “Lab”. Labs provide access to the latest technologies in the cloud and are not subject to Service Level Agreements (SLA). I signed up for the POWER8 lab and decided to share my experience and findings.

To get started, you have to open a RunAbove account and sign up for the POWER8 Lab at:

When you open a RunAbove account, you link the account to a form of payment: a credit card or a PayPal account. I had trouble using the credit card path but was able to link a PayPal account successfully.

After successfully signing up for the POWER8 lab, you are taken to the RunAbove home page which defaults to “Add an Instance”.


The process to create a POWER8 instance (aka virtual machine) is straightforward. You select the data center (North America BHS-1), the “instance flavor” (Power 8S), and the instance image (Ubuntu 14.04).


Then, you select the ssh key to access the virtual machine. The first time I created an instance, I had to add my ssh key. After that, I just had to select among the available ssh keys.

The last step is to enter the instance name and you are ready to “fire up”. The IBM POWER8 S flavor gives you a POWER8 virtual machine with 8 virtual processors, 4 GB of RAM, and 10 GB of object storage. The virtual machine is connected to the default external network. The Ubuntu 14.04 image is preloaded in the virtual machine.

After a couple of minutes, you get the IP address and can ssh to your POWER8 virtual machine.



You can log in to your POWER8 virtual machine and upgrade the Linux image to the latest release available, using the appropriate Linux distribution commands. I was able to successfully upgrade to Ubuntu 14.10.

The default RunAbove interface (Simple Mode) provides access to a limited set of tasks, e.g. adding and removing instances, SSH keys, and object storage. The OpenStack Horizon interface, accessed through the drop-down menu under the user name, provides access to an extended set of tasks and options.


Some of the capabilities available through the OpenStack Horizon interface are:

Create snapshots. This function is very helpful to capture custom images that can be used later on to create other virtual machines. I created a snapshot of the POWER8 virtual machine after upgrading the Linux image to Ubuntu 14.10.


Manage project images. You can add images to your project by creating snapshots of your virtual machines or importing an image using the Create Image task. The figure below shows a couple of snapshots of POWER8 virtual machines after the images were customized by upgrading to Ubuntu 14.10 or adding various packages for development purposes.


Add private network connections. You can create a local network and connect your virtual machines to it when you create an instance.


Create instance from snapshot. The launch instance task, provided in the OpenStack Horizon interface, allows you to create a virtual machine using a snapshot from the project image library. In this example, the snapshot of a virtual machine that was upgraded to Ubuntu 14.10 was selected.



Customize instance configuration. The launch instance task also allows you to add the virtual machine to a private network and specify post-deployment customization scripts, e.g. OpenStack user-data.


All of these capabilities are also available through OpenStack APIs. The figure below lists all the supported OpenStack services.


Billing is based on created instances. The hourly rate ($0.05/hr) is charged even if the instance is inactive or you never log in to the instance. There is also a small charge for storing custom images or snapshots.
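Since the meter runs whether or not an instance is used, it is worth doing the arithmetic before leaving instances allocated. A quick back-of-the-envelope sketch, assuming the $0.05/hr rate quoted above and a 30-day month:

```python
HOURLY_RATE = 0.05  # USD/hr for the POWER8 S flavor, per the rate quoted above

def monthly_cost(hours, rate=HOURLY_RATE):
    """Estimated charge for keeping one instance allocated for `hours` hours."""
    return round(hours * rate, 2)

# A 30-day month is 720 hours; the instance is billed even while idle.
print(monthly_cost(720))  # 36.0
```

Deleting idle instances and re-deploying them later from a snapshot keeps you from paying for hours you never use.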

To summarize, you can quickly provision a POWER8 environment to meet your project needs using OVH RunAbove interfaces as follows:

  • Use “Add Instance” to create a POWER8 virtual machine. Customize the Linux image with the desired development environment / packages or workloads
    • Upgrade to desired OS level
    • Install any applications, packages or files needed to support your project
  • Create a snapshot of the POWER8 virtual machine with custom image
  • Use “Launch Instance” to create a POWER8 virtual machine using the snapshot of your custom image
    • For quick and consistent deployment of desired environment on multiple POWER8 virtual machines
  • Delete and re-deploy POWER8 virtual machines as needed to meet your project demands
  • Use OpenStack APIs to automate deployment of POWER8 Virtual Machines
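The last step above can be sketched with the openstacksdk Python library. This is only an illustrative outline, not RunAbove-specific code: the cloud name "runabove", the snapshot, flavor, and key names are placeholders, and the cloud entry would have to be defined in your local clouds.yaml.

```python
def server_spec(name, image_id, flavor_id, key_name):
    """Assemble the keyword arguments for compute.create_server()."""
    return {
        "name": name,
        "image_id": image_id,
        "flavor_id": flavor_id,
        "key_name": key_name,
    }

def deploy_from_snapshot(conn, name, snapshot_name, flavor_name, key_name):
    """Create a VM from a project snapshot and wait until it is ACTIVE."""
    image = conn.compute.find_image(snapshot_name)
    flavor = conn.compute.find_flavor(flavor_name)
    server = conn.compute.create_server(
        **server_spec(name, image.id, flavor.id, key_name))
    return conn.compute.wait_for_server(server)

def deploy_example():
    """Example driver (requires openstacksdk and a configured cloud)."""
    import openstack  # pip install openstacksdk
    conn = openstack.connect(cloud="runabove")  # placeholder cloud name
    return deploy_from_snapshot(conn, "p8-dev-01", "ubuntu-14.10-custom",
                                "power8.s", "my-ssh-key")
```

Wrapping the whole flow in a function like `deploy_from_snapshot` makes it easy to delete and re-deploy identical POWER8 environments on demand, which is the point of the snapshot-based workflow described above.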

For more information about the OVH POWER8 cloud services and to sign up for the POWER8 lab visit:

OpenPOWER Breaks Through 100 … A Signal of Even More Innovation to Come!

By: Gordon MacKean, OpenPOWER Chairman

Today the OpenPOWER Foundation hit a new milestone. We are now officially 101 members strong. But we all know numbers in themselves are not what is significant; it is what they represent. For the OpenPOWER community, this milestone signals that we’re on the right track and, with each new member that comes on board, our collaboration and resulting innovation multiply. Speaking of numbers, here are a few more that illustrate our progress …

  • 1 OpenPOWER started with one shared idea — to drive more innovation in the data center.
  • 5 Beginning with five founders — IBM, Google, NVIDIA, Mellanox and Tyan, the OpenPOWER Foundation has exponentially grown to now …
  • 101 OpenPOWER Foundation members around the world representing a diverse set of leaders from across the technology industry. From cloud service providers and technology consumers to chip designers, hardware components, system vendors and firmware and software providers and beyond, they’re all leveraging POWER’s open architecture to drive innovation. OpenPOWER’s membership is also geographically diverse, representing …
  • 22 countries across 6 continents with a membership roster spanning Asia, North America, South America, Australia, Africa and Europe. Note: No members from Antarctica. Yet. 😉
  • 8 To date the Foundation has chartered eight member Working Groups organized by technical focus areas of interest including interoperability, system software, memory, compliance, hardware architecture, application software, accelerators and the development of an open server development platform. The work being accomplished by these groups supports …
  • 35 confirmed member presentations detailing OpenPOWER products and projects underway that will be shared at the OpenPOWER Foundation’s debut conference, the OpenPOWER Summit, taking place at the San Jose Convention Center March 17-19. So hurry, there’s only ….
  • 28 days left until the Summit begins! Come and join us. Learn more and register today by going to

Looking forward to seeing you in San Jose! Now, let’s get back to collaborating and innovating.


Introducing OpenPOWER Developer Tools – A One Stop Resource for Porting and Building OpenPOWER Compatible Solutions

by Jeff Scheel, IBM Linux on Power Chief Engineer

Since its inception, the OpenPOWER Foundation has been dedicated to fostering a collaborative environment to drive meaningful hardware and software innovation.  As part of that commitment, OpenPOWER today launched a valuable new asset for developers:  OpenPOWER Developer Tools.

Available on the Technology Resources section of the OpenPOWER website, OpenPOWER Developer Tools is an essential starting place for developers looking to participate in OpenPOWER’s growing ecosystem.  The tool kits and other resources – made available by our members – include a variety of hardware, software and other technical resources that enable developers to more quickly leverage POWER’s open architecture and build solutions that follow OpenPOWER’s design concept.

Current tools and technical assets include Tyan’s OpenPOWER customer reference system, Nallatech’s CAPI Developer Kit, a software developer toolkit for Linux on POWER, access to IBM’s Power Development Cloud and more.  And, with over 90 members and counting, we expect to post additional tools on an ongoing basis … so check back often!

The Next Peak in HPC

National Center for Computational Sciences
Oak Ridge National Laboratory
Oak Ridge, TN, USA


Hybrid CPU+GPU architectures are a response to the power limitations imposed by the end of processor clock-rate scaling in the last decade. This limitation continues to drive supercomputer architecture designs toward massively parallel, hierarchical, and/or hybrid systems, and we expect that, for the foreseeable future, large leadership computing systems will continue this trajectory in order to address science and engineering challenges for government, academia, and industry. Consistent with this trend, the U.S. Department of Energy’s (DOE) Oak Ridge Leadership Computing Facility (OLCF) has signed a contract with IBM to bring a next-generation supercomputer to the Oak Ridge National Laboratory (ORNL) in 2017. This new supercomputer, named Summit, will provide at least five times the performance of Titan, the OLCF’s current hybrid CPU+GPU leadership system, on science applications, and become the next peak in leadership-class computing systems for open science. To deliver this new capability, IBM has formed a partnership with NVIDIA and Mellanox, all members of the OpenPOWER Foundation, and each will provide system components for Summit. In addition, OLCF will partner with eight application software development teams to jointly prepare their science applications for the Summit architecture and carry out an early science campaign to demonstrate Summit’s new capabilities. These application-readiness partnerships, with support from the IBM/NVIDIA Center of Excellence at Oak Ridge, will exercise Summit’s programming models and harden its software tools. To meet DOE’s broad science and energy missions, DOE procurements continue to support diversity in architectures, and in this context more mature programming environments, enabling performance-portable software engineering, become a requirement for DOE supercomputing facilities.
To prepare mission-critical scientific applications now and for the next generation of systems, our center continues to advance open standards and work closely with ecosystem partners to address the needs of our users. These efforts will be outlined in this talk.


Tjerk Straatsma
Jim Rogers
Adam Simpson
Ashley Barker
Fernanda Foertter
Jack Wells

Speaker Bio

Jack C. Wells, Ph.D.
Director of Science
National Center for Computational Science
Oak Ridge National Laboratory

Jack Wells is the Director of Science for the National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory (ORNL). He is responsible for devising the strategy to ensure cost-effective, state-of-the-art scientific computing at the NCCS, which hosts the Department of Energy’s Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science national user facility, and Titan, currently the fastest supercomputer in the United States.

In ORNL’s Computing and Computational Sciences Directorate, Wells previously led both the Computational Materials Sciences group in the Computer Science and Mathematics Division and the Nanomaterials Theory Institute in the Center for Nanophase Materials Sciences. During an off-site assignment from 2006 to 2008, he served as a legislative fellow for U.S. Senator Lamar Alexander of Tennessee, providing information about high-performance computing, energy technology, and science, technology, engineering, and mathematics education policy issues.

Wells began his ORNL career in 1990, conducting resident research for his Ph.D. in physics from Vanderbilt University.  Following a three-year postdoctoral fellowship at the Harvard-Smithsonian Center for Astrophysics, he returned to ORNL as a staff scientist in 1997 as a Wigner Fellow.  Jack is an accomplished practitioner of computational physics and has been sponsored in his research by the Department of Energy’s Office of Basic Energy Sciences.

Jack has authored or co-authored over 70 scientific papers and edited 1 book, spanning nanoscience, materials science and engineering, nuclear and atomic physics, computational science, and applied mathematics.


Download Presentation

Back to Summit Details

Join Us for an Action Packed 2015 Summit

– More than 30 presentations on a wide field of OpenPOWER topics
– Demo pavilion
– ISV roundtable
– Firmware training

It’s going to be an action-packed Summit – come join us.

Accelerated Photodynamic Cancer Therapy Planning with FullMonte on OpenPOWER


Photodynamic therapy (PDT) is a minimally-invasive cancer therapy which uses a light-activated drug (photosensitizer/PS). When the photosensitizer absorbs a photon, it excites tissue oxygen into a reactive state which causes very localized cell damage. The light field distribution inside the tissue is therefore one of the critical parameters determining the treatment’s safety and efficacy. While FDA-approved and used for superficial indications, PDT has yet to be widely adopted for interstitial use in larger tumours, where light is delivered by optical fibres, due to a lack of simulation and planning optimization software. Because tissue has a very high scattering coefficient at optical wavelengths, extensive Monte Carlo modeling of light transport is required to simulate the light distribution for a given treatment plan. To enable PDT planning, we demonstrate here our “FullMonte” system, which uses a CAPI-enabled FPGA to simulate light propagation 4x faster and 67x more power-efficiently than a highly-tuned multicore CPU implementation. With coherent low-latency access to host memory, we are not limited by the size of on-chip memory and are able to transfer results to and from the accelerator rapidly, which will support our iterative planning flow. Potential advantages of interstitial PDT include less invasiveness and fewer post-operative complications than surgery, better damage targeting and confinement than radiation therapy, and, unlike chemotherapy, no systemic toxicity. While attractive in developed markets for better outcomes, PDT is doubly attractive in emerging regions because it offers the possibility of a single-shot treatment with very low-cost, even portable, equipment supported by remotely provided computing services for planning.
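The flavor of the Monte Carlo light-transport computation that FullMonte accelerates can be shown with a deliberately simplified toy model. This is not FullMonte’s algorithm: real tissue simulators use anisotropic (Henyey-Greenstein) scattering, 3-D tetrahedral meshes, and photon-weight roulette, while this sketch tracks single photons in a 1-D homogeneous slab with isotropic scattering.

```python
import math
import random

def slab_fractions(mu_a, mu_s, thickness, n_photons=20000, seed=1):
    """Toy 1-D Monte Carlo photon transport through a slab [0, thickness].
    mu_a, mu_s: absorption and scattering coefficients (1/cm).
    Returns (absorbed, reflected, transmitted) photon fractions."""
    random.seed(seed)
    mu_t = mu_a + mu_s
    absorbed = reflected = transmitted = 0
    for _ in range(n_photons):
        z, mu = 0.0, 1.0  # depth and direction cosine; photon enters along +z
        while True:
            # Sample a free path length from the exponential distribution.
            step = -math.log(1.0 - random.random()) / mu_t
            z += mu * step
            if z < 0.0:
                reflected += 1
                break
            if z > thickness:
                transmitted += 1
                break
            if random.random() < mu_a / mu_t:
                absorbed += 1  # interaction was an absorption event
                break
            mu = random.uniform(-1.0, 1.0)  # isotropic scatter: new direction
    n = float(n_photons)
    return absorbed / n, reflected / n, transmitted / n
```

Because each photon takes many scattering steps and photons are independent, this workload is embarrassingly parallel, which is exactly what makes it a good fit for FPGA or GPU acceleration.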


Jeffrey Cassidy, MASc, PEng, is a PhD candidate in Electrical and Computer Engineering at the University of Toronto.
Lothar Lilge, PhD, is a senior scientist at the Princess Margaret Cancer Centre and a professor of Medical Biophysics at the University of Toronto.
Vaughn Betz, PhD, is the NSERC-Altera Chair in Programmable Silicon at the University of Toronto.


The work is supported by the Canadian Institutes of Health Research, the Canadian Natural Sciences and Engineering Research Council, IBM, Altera, Bluespec, and the Southern Ontario Smart Computing Innovation Platform.


Download Presentation

Back to Summit Details

Data center and Cloud computing market landscape and challenges

Presentation Objective

In this talk, we will survey the data center and cloud computing market landscape and its challenges, discuss the technology challenges that limit the scaling of cloud computing, which is growing at an exponential pace, and wrap up with insights into how FPGAs combined with general-purpose processors are transforming next-generation data centers with tremendous compute horsepower, low latency and extreme power efficiency.


Data center workloads demand high computational capability, flexibility, power efficiency, and low cost. In the computing hierarchy, general-purpose CPUs excel at von Neumann (serial) processing, GPUs perform well on highly regular SIMD processing, and inherently parallel FPGAs excel at specialized workloads. Examples of specialized workloads include compute and network acceleration, video and data analytics, financial trading, storage, database and security.  High-level programming languages such as OpenCL have created a common development environment for CPUs, GPUs and FPGAs. This has led to the adoption of hybrid architectures and a heterogeneous world. This talk showcases FPGA-based acceleration examples with CAPI attach through OpenPOWER collaboration and highlights the performance, power and latency benefits.

Speaker Bio

Manoj Roge is Director of Wired & Data Center Solutions Planning at Xilinx, responsible for product/roadmap strategy and for driving technology collaborations with partners. Manoj has spent 21 years in the semiconductor industry, the past 10 of them in the FPGA industry, in various engineering and marketing/business development roles with increasing responsibility. He has been instrumental in driving broad, innovative solutions through his participation in multiple standards bodies and consortiums. He holds an MBA from Santa Clara University, an MSEE from the University of Texas at Arlington and a BSEE from the University of Bombay.


Download Presentation


System Management Tool for OpenPOWER

Introduction to Authors

Song Yu: Male, IBM STG China, Development Manager
Li Guang Cheng: Male, IBM STG China, xCAT Senior Architect
Mao Qiu Yin: Male, Teamsun, Director
Hu Hai Chen: Male, Teamsun, Development Manager
Ma Yuan Liang: Male, Teamsun, System Department Manager
Chen Qing Hong: Male, Teamsun, Architect


OpenPOWER is a new-generation platform, and as OpenPOWER machines come into wide use in both cloud and non-cloud environments, infrastructure-level management is the most important requirement for these new systems.

In the cloud area

The end user normally cares about SaaS or PaaS, but cloud admins must consider how to manage the OpenPOWER physical nodes that provide those services. Quickly and automatically provisioning physical machines and adding physical nodes into the cloud are basic and important requirements for a cloud data center.

At the same time, a cloud provider that supports HPC-related services must offer physical compute resources to end users rather than virtual ones. How to provide self-service for physical nodes is a new challenge in the public cloud.

In the non-cloud area: a lightweight system management tool for OpenPOWER is also required. How to control the hardware, and how to integrate smoothly with existing Power or x86 clusters, are the major challenges for an OpenPOWER system management tool.

Demonstrated Features

  1. HW Control – Remote power, remote console, hardware inventory, hardware vitals, energy management, and so on
  2. Automatic Discovery – Automatically discover new OpenPOWER hardware and add it to the management system
  3. Provisioning – Unattended OS and application deployment onto OpenPOWER nodes
  4. Image Management – Clone images, or generate images, including applications, from scratch
  5. KVM Management – Provision the KVM hypervisor and manage the VM lifecycle
  6. Docker Management – Provision Docker on OpenPOWER nodes and manage the container lifecycle
  7. Multitenancy – Support user, role, tenant, and policy management; work with Keystone for authentication and integrate with OpenStack
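Hardware-control features like those in item 1 are typically driven from xCAT's command-line tools. As a hedged illustration, the sketch below only composes xCAT-style command lines (rpower, rinv, and rvitals are real xCAT commands; the noderange and the helper function are invented for this example, and nothing is actually executed):

```python
# Sketch of scripting the hardware-control features above with xCAT-style
# commands. The wrapper and node names are made up for illustration; the
# command strings are composed, not run against real hardware.
def xcat_command(verb, noderange, action):
    return f"{verb} {noderange} {action}"

commands = [
    xcat_command("rpower", "openpower[01-04]", "on"),     # remote power on
    xcat_command("rinv", "openpower[01-04]", "all"),      # hardware inventory
    xcat_command("rvitals", "openpower[01-04]", "temp"),  # hardware vitals
]
```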

Our experience

We leverage xCAT as the backend and Horizon as the frontend of the OpenPOWER management tool. xCAT already supports OpenPOWER node management and has enabled Docker on OpenPOWER systems.

Benefit: The OpenPOWER management tool is built from open-source products. It can easily manage OpenPOWER nodes, and OpenPOWER vendors can easily add their own special HW and FW control functions to the tool as value-add. The whole solution also demonstrates a complete story of how to enable OpenPOWER nodes in a private or public cloud.


Download Presentation


A performance evaluation methodology for the OpenPOWER user community using the advanced instrumentation capabilities of the POWER8 microprocessor

Speaker’s Bio

Satish Kumar Sadasivam is a Senior Performance Engineer and Master Inventor at IBM STG, responsible for compiler and hardware performance analysis and optimization of IBM Power processors and compilers. He has 9+ years of experience in computer architecture across a wide range of domains, including performance analysis, compiler optimization, HPC, competitive analysis, and processor validation. He is currently responsible for delivering performance leadership for the POWER8 processor on emerging workloads. He also evaluates competitors' (Intel) microarchitecture designs in detail and provides feedback to the POWER9 hardware design to address next-generation computing needs. He has filed more than 15 patents, achieved his 5th Invention Plateau, and has several publications to his credit.


IBM Systems and Technology Group

Presentation Objective

The primary objective of this presentation is to provide a performance evaluation methodology to the OpenPOWER user community using the advanced instrumentation capabilities available in the POWER8 microprocessor, and to present a case study on how the CPI stack cycle accounting model can be effectively used to evaluate the performance of SPEC CPU2006 workloads in various SMT modes.


This presentation is split into two sections. The first covers the key performance instrumentation capabilities of the POWER8 microprocessor and how they can be effectively utilized to understand and resolve performance bottlenecks in code. This includes a detailed look at the POWER8 CPI stack cycle accounting model, how it differs from the earlier POWER7 processor architecture, and the improvements in POWER8 that make CPI cycle accounting very precise.

The second section covers single-core SMT performance analysis of SPEC CPU2006 workloads on the POWER8 microprocessor, along with the performance evaluation methodology we used for SMT. We describe in detail how building CPI stacks at various SMT levels helps root-cause the key performance bottlenecks in code at the higher SMT levels, and how those bottlenecks can be attributed to the different units of the microprocessor.
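The idea behind a CPI stack is simple arithmetic: total cycles decompose into completion cycles plus stall components, and each component divided by the instructions completed yields its share of the overall CPI. A minimal sketch follows; the counter names and values are invented for illustration and are not actual POWER8 PMU events:

```python
# Illustrative CPI-stack breakdown from hardware counter readings.
# Components (cycles) divided by completed instructions give per-component
# CPI contributions, which sum to the total CPI.
def cpi_stack(counters, instructions):
    total_cycles = sum(counters.values())
    stack = {name: cycles / instructions for name, cycles in counters.items()}
    stack["CPI_total"] = total_cycles / instructions
    return stack

counters = {                      # hypothetical cycle counts
    "completion": 4.0e9,
    "stall_lsu": 2.5e9,
    "stall_fxu": 0.5e9,
    "stall_branch_mispredict": 1.0e9,
}
stack = cpi_stack(counters, instructions=5.0e9)
```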


Download Presentation


China POWER Technology Alliance (CPTA)


The objective is to position the China POWER Technology Alliance (CPTA) as a mechanism to help global OpenPOWER Foundation members engage with Chinese organizations on POWER-based implementations in China.


The OpenPOWER ecosystem has grown quickly in the China market, adding 12 OPF members in 2014. The China POWER Technology Alliance was established in October 2014, led by China's Ministry of Industry and Information Technology (MIIT), to accelerate the building of a secure and trusted Chinese IT industry chain by leveraging OpenPOWER technology. This presentation aims to link CPTA with OPF's global membership, helping global OPF members use CPTA as a stepping stone into the China market. It focuses on explaining to global OPF members WHY they should come to China, and above all, HOW to come to China and WHAT support services CPTA will provide. It will also clarify the relationship between CPTA and OPF in China, so OPF members can leverage CPTA as a (non-mandatory) on-ramp to China.


Zhu Ya Dong (to be confirmed), Chairman of PowerCore, China, Platinum Member of OpenPOWER Foundation


On Chip Controller (OCC)


Demonstrate POWER processor and memory capabilities that can be exploited using open source OCC firmware.


The On Chip Controller (OCC) is a co-processor that is embedded directly on the main processor die. The OCC can be used to control the processor frequency, power consumption, and temperature in order to maximize performance and minimize energy usage. This presentation will include an overview of the power, thermal, and performance data that the OCC can access, as well as the various control knobs, including adjusting the processor frequency and memory bandwidth. Details about the OCC processor, firmware structure, loop timings, off-load engines, and bus accesses will be given, along with descriptions of example algorithms, system settings, and potential future enhancements.
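To make the control-knob idea concrete, here is a toy power-capping loop in the spirit of what such firmware does: throttle frequency when measured power exceeds the cap, and raise it again when there is headroom. This is an illustrative sketch only; the function, thresholds, and step sizes are invented and are not OCC firmware code.

```python
# Toy DVFS power-capping step: pick the next frequency from the current
# frequency and a power reading. All limits here are made-up values.
def next_frequency(freq_mhz, power_w, power_cap_w, f_min=2000, f_max=4000, step=100):
    if power_w > power_cap_w:          # over the cap: throttle down
        return max(f_min, freq_mhz - step)
    if power_w < 0.9 * power_cap_w:    # comfortable headroom: speed up
        return min(f_max, freq_mhz + step)
    return freq_mhz                    # within the deadband: hold
```

A real controller runs this kind of decision in a tight periodic loop against live sensor data, which is where the loop timings and off-load engines discussed in the talk come in.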


Todd Rosedahl, IBM Chief Power/Thermal/Energy Management Engineer on POWER. Todd has worked on power, thermal, and energy management for his entire 22-year career at IBM and holds over 20 related patents. He led the firmware effort for the On Chip Controller (OCC), which was recently released as open source.


Download Presentation


FPGA Acceleration in a Power8 Cloud


OpenStack is one of the most popular software platforms for running a cloud. It manages hardware resources such as memory, disks, and x86 and POWER processors, and provides IaaS to users. OpenStack can be extended so that additional kinds of hardware, such as GPUs and FPGAs, are also managed and offered to users. FPGAs are widely used in many kinds of applications, and the POWER8 processor integrates an innovative interface called CAPI (Coherent Accelerator Processor Interface) for direct connection between an FPGA and the POWER8 chip. CAPI not only provides a low-latency, high-bandwidth, cache-coherent interconnect between the user's accelerator hardware and the application software, but also makes programming easier for both accelerator hardware developers and software developers. Building on these features, we extend OpenStack so that cloud users can remotely use POWER8 machines with FPGA acceleration.

Our work allows cloud users to upload their accelerator designs to an automatic compilation service; the accelerators are then deployed into a customized OpenStack cloud of POWER8 machines with FPGA cards. When cloud users launch virtual machines (VMs) in this cloud, their accelerators can be attached to their VMs, so that inside these VMs they can use the accelerators in their applications. Like operating system images in a cloud, accelerators can also be shared or sold across the whole cloud, so that one user's accelerator can benefit other users.
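The upload-compile-attach lifecycle just described can be sketched as a small catalog object. The class, method names, and bitstream-id scheme below are invented purely for illustration; they are not part of OpenStack, CAPI, or the authors' system:

```python
# Hypothetical sketch of the accelerator lifecycle: upload a design,
# "compile" it, then attach the result to a VM. Names are invented.
class AcceleratorCatalog:
    def __init__(self):
        self._images = {}       # accelerator name -> compiled bitstream id
        self._attachments = {}  # vm id -> accelerator name

    def upload(self, name, hdl_source):
        # stand-in for the automatic FPGA compilation service
        self._images[name] = f"bitstream-{hash(hdl_source) & 0xffff:04x}"
        return self._images[name]

    def attach(self, vm_id, name):
        if name not in self._images:
            raise KeyError(f"accelerator {name!r} has not been compiled")
        self._attachments[vm_id] = name

catalog = AcceleratorCatalog()
catalog.upload("memcpy-offload", "-- accelerator HDL source here --")
catalog.attach("vm-42", "memcpy-offload")
```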

By enabling CAPI in the cloud, our work lowers the barrier to using FPGA acceleration and encourages people to use accelerators in their applications and to share them with all cloud users; the CAPI and FPGA acceleration ecosystem benefits in turn. A public cloud based on this work is in testing and is being used by university students. Remote access to the cloud is enabled, so a live demo can be shown.


Fei Chen works at IBM China Research Lab, focusing on cloud and big data. He received his B.S. from Tsinghua University, China, and his Ph.D. from the Institute of Computing Technology, Chinese Academy of Sciences, in 2011. He worked on hardware design for many years and now focuses on integrating heterogeneous computing resources into the cloud.
Organization: IBM China Research Lab (CRL)


Download Presentation


SuperVessel — OpenPOWER R&D cloud with operation and practice experience sharing


SuperVessel is a cloud platform built on top of POWER/OpenPOWER architecture technologies. It aims to provide open remote access for all ecosystem developers and university students. We (IBM Research China, IBM Systems Technology Lab in China, and partners) have run this cloud for more than 3 months and have rapidly attracted public users from more than 30 universities, including from GCG and the United States.

The cloud was built on OpenStack and enables:

  • The latest infrastructure services, including PowerKVM, containers, and Docker, with big-endian and little-endian options.
  • Big data services, through collaboration with IBM big data technology for Hadoop 1.0 and open-source technology for Hadoop 2.0 (Spark service).
  • An IoT (Internet of Things) application platform service, which has successfully incubated several projects in areas such as healthcare and smart cities.
  • Accelerator-as-a-service (FPGA virtualization) with a novel marketplace, through collaboration with Altera.

In this presentation, we share how we built the cloud IaaS and PaaS with open technologies on OpenPOWER, and what differs when building a cloud for POWER rather than x86. Most important is the sharing of operational experience (with data) from running the cloud services on POWER/OpenPOWER.

Objective for the presentation

  1. With our real story of the SuperVessel cloud, give industry real and strong confidence that OpenPOWER can easily be used for cloud, mobile, and analytics.
  2. With our real experience, tell industry how to build cloud and big data services with OpenPOWER.
  3. Encourage the industry ecosystem to build their own clouds easily, attracting more and more developers to OpenPOWER (this will be very important for OpenPOWER's success).
  4. Encourage our partners and developers to leverage SuperVessel to speed up their R&D work on OpenPOWER; SuperVessel is open to them for use and collaboration.

Speaker Bio

Speaker Name: Yonghua Lin, IBM Research China
Yonghua Lin is a Senior Technical Staff Member and Senior Manager of the Cloud Infrastructure group at IBM Research. She has worked on system architecture research at IBM for 12 years, covering many of IBM's multicore processors over the past 10 years, including the IBM network processor, the Cell processor, PRISM, POWER6, and POWER7. She initiated mobile infrastructure on cloud in 2007, work that has become today's Network Function Virtualization. She led the IBM team that built the first optimized cloud for 4G mobile infrastructure, successfully demonstrated at the ITU, Mobile World Congress, and elsewhere. She founded the SuperVessel cloud to support OpenPOWER research and development in industry. She holds more than 40 patents granted worldwide and has publications in top conferences and journals.


HPC solution stack on OpenPOWER

Introduction to Authors

Bin Xu: Male, IBM STG China, advisory software engineer and PCM architect, mainly focused on High Performance Computing and software-defined environments.

Jing Li: Male, IBM STG China, development manager for PCM/PHPC.


OpenPOWER will be one of the major platforms used across industries, especially in High Performance Computing (HPC). IBM Platform Cluster Manager (PCM) is the most popular cluster management software aiming to simplify system and workload management in the data center.


As OpenPOWER is a brand-new platform based on IBM POWER technology, customers are asking whether their end-to-end applications, or even complete solutions, can run well on it.

Our experience: This demo shows that IBM OpenPOWER can serve as the foundation of a complete, complex High Performance Computing solution. From HPC cluster deployment, job scheduling, system management, and application management to the scientific computing workloads on top of them, all components can be constructed on the IBM OpenPOWER platform with good usability and performance. The demo also shows the simplicity of migrating a complete x86-based HPC stack to the OpenPOWER platform. In this demo, Platform Cluster Manager (PCM) and xCAT serve as the deployment and management facilitators; Platform HPC provides the overall solution, integrated with Platform LSF (Load Sharing Facility), Platform MPI, and other HPC middleware; and two popular HPC applications are demonstrated on this stack.


The demo consists of three steps:

  • The admin installs the head node.
  • The admin discovers the other nodes and provisions them to join the HPC cluster automatically.
  • Users run their HPC applications and monitor the cluster on the dashboard.


Faster and easier deployment of the HPC cluster environment based on OpenPOWER technology, plus system management and workload management with great usability for OpenPOWER HPC.

Next Steps and Recommendations

Integration with other applications in the OpenPOWER environment.


Download Presentation


Power and Speed: Maximizing Application Performance on IBM Power Systems with XL C/C++ Compiler

Presentation Objective

How to optimize your application to fully exploit the functionality of your POWER system.


This presentation provides the latest news on IBM's compilers on POWER. It covers the major features that enhance portability, such as improved standards compliance and GCC source-code and option compatibility, as well as performance tuning and compiler optimization tips to maximize workload performance on IBM Power Systems, including exploitation of the POWER8 processor and architecture.


Yaoqing Gao is a Senior Technical Staff Member at the IBM Canada Lab in the compiler development area. His major interests are compilation technology, optimization and performance tuning tools, parallel programming models and languages, and computer architecture. He has done research and development for the IBM XL C/C++ and Fortran compiler products on IBM POWER, System z, Cell processors, and Blue Gene. He has authored over 30 papers in journals and conferences, has been an IBM Master Inventor since 2006, and has authored over 30 issued and pending patents.




Download Presentation


XL C/C++ and GPU Programming on Power Systems

Presentation Objective

Provide information on the integration of the NVIDIA Tesla GPU with IBM's POWER8 processor, and details on how to develop for this platform using NVIDIA's software stack and the POWER8 compilers.


The OpenPOWER Foundation is an organization with a mandate to enable member companies to customize POWER CPU processors and system platforms for optimization and innovation for their business needs. One such customization is the integration of graphics processing unit (GPU) technology with the POWER processor. IBM has recently announced the IBM Power S824L system, a data processing powerhouse that integrates the NVIDIA Tesla GPU with IBM's POWER8 processor. This joint presentation by NVIDIA and IBM will detail the S824L system, including an overview of the Tesla GPU and how it interoperates with the POWER8 processor, and will describe the NVIDIA software stack and how it works with the POWER8 compilers.

Speaker Bio:

Kelvin Li is an Advisory Software Developer at the IBM Canada Lab in the compiler development area. He has experience in Fortran, C, and C++ compiler development, and his interest is in parallel programming models and languages. He is the IBM representative on the OpenMP Architecture Review Board, a member of its language committee, and the chair of the Fortran subcommittee.


Download Presentation



POWER8 — the first OpenPOWER processor


The POWER8 processor is the latest RISC (Reduced Instruction Set Computer) microprocessor from IBM and the first processor supporting the new OpenPOWER software environment. POWER8 was designed to deliver unprecedented performance for emerging workloads, such as business analytics and big data applications, cloud computing, and scale-out data center workloads. It is fabricated in IBM's 22-nm silicon-on-insulator (SOI) technology with multiple layers of metal, and it has been designed to significantly improve both single-thread performance and single-core throughput over its predecessor, the POWER7+ processor.

The rate of increase in processor frequency enabled by new silicon technology advancements has decreased dramatically in recent generations compared to the historic trend. This has caused many processor designs in the industry to show very little improvement in either single-thread or single-core performance, with larger core counts pursued instead in each generation. Going against this industry trend, the POWER8 processor relies on a much improved core and nest microarchitecture to achieve approximately one-and-a-half times the single-thread performance and twice the single-core throughput of the POWER7 processor in several commercial applications. Combined with a 50% increase in the number of cores (from 8 in the POWER7 processor to 12 in the POWER8 processor), the result is a processor that leads the industry in performance for enterprise workloads. This talk will describe the architecture and microarchitecture innovations in the POWER8 processor that deliver these significant performance benefits for cloud applications, workload optimization features for stream processing, analytics, and big data workloads, and support for organic workload growth.

Finally, this talk will introduce the CAPI accelerator interface, which offers system architects a way to accelerate their workloads with custom accelerators that integrate seamlessly with the Power system architecture.

Michael Gschwind, PhD
STSM & Manager, System Architecture, IBM Systems & Technology Group
Fellow, IEEE – Member, IBM Academy of Technology – IBM Master Inventor


Download Presentation


Data Centric Interactive Visualization of Very Large Data

Speakers:  Bruce D’Amora and Gordon Fossum
Organization: IBM T.J. Watson Research, Data Centric Systems Group


The traditional workflow for high-performance computing simulation and analytics is to prepare the input data set, run a simulation, and visualize the results as a post-processing step. This process generally requires multiple computer systems designed for accelerating simulation and visualization. In the medical imaging and seismic domains, the data to be visualized typically comprise uniform three-dimensional arrays that can approach tens of petabytes. Transferring this data from one system to another can be daunting and in some cases may violate privacy, security, and export constraints.  Visually exploring these very large data sets consumes significant system resources and time that can be conserved if the simulation and visualization can reside on the same system to avoid time-consuming data transfer between systems. End-to-end workflow time can be reduced if the simulation and visualization can be performed simultaneously with a fast and efficient transfer of simulation output to visualization input.

Data-centric visualization provides a platform architecture in which the same high-performance server system can execute simulation, analytics, and visualization. We present a visualization framework for interactively exploring very large data sets using both isoparametric point extraction and direct volume-rendering techniques. Our design and implementation leverage high-performance IBM Power servers with NVIDIA GPU accelerators and flash-based, high-bandwidth, low-latency memory. GPUs accelerate the generation and compression of two-dimensional images that can be transferred across a network to a range of devices, including large display walls, workstations/PCs, and smart devices. Users can remotely steer visualization, simulation, and analytics applications from a range of end-user devices, including common smart devices such as phones and tablets. In this presentation, we discuss and demonstrate an early implementation and outline challenges for future work.
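As a toy illustration of the volume-rendering idea above, a maximum-intensity projection (MIP) collapses a 3-D scalar field along the view axis by keeping the brightest sample along each ray. The sketch below is purely illustrative: pure Python over a tiny hand-made volume, whereas the system in the talk renders multi-petabyte-scale data on GPU-accelerated Power servers.

```python
# Toy maximum-intensity projection: for each (y, x) ray, keep the
# brightest voxel along z. volume is a nested list indexed [z][y][x].
def mip(volume, axis=0):
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis != 0:
        raise NotImplementedError("sketch projects along z only")
    return [[max(volume[z][y][x] for z in range(nz)) for x in range(nx)]
            for y in range(ny)]

volume = [
    [[0, 1], [2, 3]],   # z = 0
    [[9, 0], [1, 5]],   # z = 1
]
image = mip(volume)  # 2x2 image: [[9, 1], [2, 5]]
```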

Speaker Bios

Bruce D’Amora, IBM Research Division, Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York 10598. Mr. D’Amora is a Senior Technical Staff Member in the Computational Sciences department, Data-centric Computing group. He is currently focusing on frameworks to enable computational steering and visualization for high-performance computing applications. Previously, Mr. D’Amora was the chief architect of Cell Broadband Engine-based platforms to accelerate applications used for creating digital animation and visual effects. He has been a lead developer on many projects ranging from applications to microprocessors and holds a number of hardware and software patents. He joined IBM Research in 2000 after serving as the Chief Software Architect for the IBM graphics development group in Austin, Texas, where he led the OpenGL development effort from 1991 to 2000. He holds Bachelor’s degrees in Microbiology and Applied Mathematics from the University of Colorado and a Master’s degree in Computer Science from National Technological University.

Gordon C. Fossum, IBM Research Division, Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York 10598. Mr. Fossum is an Advisory Engineer in Computational Sciences at the Thomas J. Watson Research Center. He received a B.S. in Mathematics and Computer Science from the University of Illinois in 1978, an M.S. in Computer Science from the University of California, Berkeley in 1981, and attained “all but dissertation” status from the University of Texas in 1987. He subsequently joined IBM Austin, where he worked on computer graphics hardware development, Cell Broadband Engine development, and OpenCL development. He is an author or coauthor of 34 patents, has received a “high value patent” award from IBM, and was named an IBM Master Inventor in 2005. In January 2014, he transferred to IBM Research to help enable visualization of “big data” in a data-centric computing environment.


Download Presentation


Key-Value Store Acceleration with OpenPOWER


  • To showcase a broadly relevant data center application on FPGAs and OpenPOWER and the benefits it can bring
  • To demonstrate the advantages of OpenPOWER’s shared virtual memory concept
  • To entice partner companies to develop infrastructure and more sophisticated designs on top of our FPGA-based accelerator card


Distributed key-value stores such as memcached form a critical middleware layer in today’s web infrastructure. However, typical x86-based systems yield limited performance scalability and high power consumption: their architecture, optimized for single-thread performance, is not well matched to the memory-intensive, parallel nature of this application. In this talk, we present the architecture of an accelerated key-value store appliance that leverages a novel data-flow implementation of memcached on a Field Programmable Gate Array (FPGA) to achieve up to 36x better performance/power at response times in the microsecond range, together with coherent integration of memory through IBM’s OpenPOWER architecture, utilizing host memory and CAPI-attached flash as the value store. This allows economic scaling of value-store density to terabytes while providing an open platform that can be augmented with additional functionality, such as data analytics, that can easily be partitioned between the POWER8 processor and the FPGA.
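To give a feel for the workload being accelerated, here is a heavily simplified sketch of the memcached text protocol that such a data-flow pipeline parses. A Python dict stands in for the DRAM/flash value store, and only bare get/set are modeled (the real set command also carries flags, expiry, and byte counts); this is an illustration, not the appliance's implementation:

```python
# Minimal memcached-style request handler: parse a text command,
# look up or update an in-memory store, and return the text reply.
class TinyMemcached:
    def __init__(self):
        self.store = {}

    def handle(self, request):
        parts = request.split()
        if parts[0] == "set":          # set <key> <value>
            self.store[parts[1]] = parts[2]
            return "STORED"
        if parts[0] == "get":          # get <key>
            key = parts[1]
            if key in self.store:
                return f"VALUE {key} {self.store[key]}\r\nEND"
            return "END"
        return "ERROR"

kv = TinyMemcached()
kv.handle("set color blue")
reply = kv.handle("get color")   # "VALUE color blue\r\nEND"
```

The appeal of an FPGA here is that each stage of this parse/lookup/respond path becomes a pipeline stage operating on the network stream directly, which is what yields microsecond-range response times.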


Michaela Blott graduated from the University of Kaiserslautern in Germany. She has worked in research institutions (ETH and Bell Labs) as well as development organizations, and was deeply involved in large-scale international collaborations such as NetFPGA-10G. Her expertise spans high-speed networking, emerging memory technologies, data centers, and distributed computing systems, with a focus on FPGA-based implementations. Today she is a principal engineer at Xilinx Labs in Dublin, heading a team of international researchers. Her key responsibility is exploring applications, system architectures, and new design flows for FPGAs in data centers.


Download Presentation


Tyan OpenPOWER products and future product plans

Presentation Objectives

Invited to participate in the OpenPOWER Foundation, TYAN developed an OpenPOWER reference board following the spirit of innovation and collaboration that defines the OpenPOWER architecture, and contributed the associated reference design to the community. In this presentation, TYAN shares its value proposition to the community and reveals its future product plan and associated milestones to the audience of the first OpenPOWER Summit 2015.


TYAN will introduce itself, brief the audience on the contributions it has made to the OpenPOWER community over the past twelve months, and share its future product plan and associated milestones.

Speaker Bio

Albert Mu is Vice President at MiTAC Computing Technology Corporation and General Manager of its Tyan Business Unit. From 2005 to 2008 he was with Intel as General Manager of the Global Server Innovation Group (GSIG), chartered to develop differentiated system platform products for the Internet portal data center and cloud segments. Prior to Intel, Albert was Vice President and General Manager of the Network, Storage, and Server Group (NSSG) at Promise Technology, Inc., and Corporate Vice President and Chief Technology Officer at Wistron Corporation. Prior to Wistron, he was Vice President of Engineering at Clarent Corporation and worked at Cisco, HaL Computer, and MIPS Computer. Mr. Mu received a BSEE from National Chiao Tung University, Hsinchu, Taiwan, an MSEE from the University of Texas at Austin, and an MS in Engineering Management from Stanford University.


Download Presentation


Trusted Computing Applied in OpenPOWER Linux

Introduction to Authors

Mao Qiu Yin: Male, Teamsun, Director
Zhiqiang Tian: Male, Teamsun, SW Developer


Computer system security is increasingly emphasized by the Chinese government, which has created its own security standards. As a new open platform, OpenPOWER urgently needs to meet China's trusted computing security standards and provide a prototype system that conforms to the specifications, in order to satisfy the demands of the OpenPOWER ecosystem's development in China.

Demonstrated Features

  1. Trusted motherboard: serves as the RTM of trusted computing and provides the highest-security solution.
  2. TPCM card: a PCIe device that implements TCM with no hardware change to the system.
  3. TPCM driver support in Power OS.
  4. Trusted computing implemented in the OS kernel, based on a white list and a trusted database.
  5. Trusted chain passing from the RTM up to applications.
  6. TPCM card support at the OpenPOWER firmware level, enabling OpenPOWER virtualization.
  7. Application of the OpenPOWER trusted computing node in a China security cloud system.

Our experience

We chose Power Linux as the application OS; it is easy to port the whole trusted computing software stack to other UNIX-like OSes such as AIX on Power.


This prototype implementation on an OpenPOWER system abiding by China's security standards provides strong support for comprehensive POWER system promotion, and at the same time provides a powerful guarantee for the development of the POWER ecosystem in China's high-security market. It broadens the range of options for Chinese ISVs and IHVs with a total solution spanning hardware and software.


Download Presentation


Reflections on Migrating IBM APP Genomic Workflow Acceleration to IBM POWER8

Author: Chandler Wilkerson, Rice University


To describe the challenges and lessons learned while installing the IBM Power Ready Platform for Genomic Workflow Acceleration on new IBM POWER8 hardware.


Migrating any workflow to a new hardware platform generates challenges and requires adaptability. With the transition from POWER7 to POWER8, the addition of PowerKVM obviates the need for VIOS and provides the opportunity to manage virtual machines on the POWER platform in a much more Linux-friendly manner. In addition, a number of changes in Red Hat Enterprise Linux between versions 6 and 7 (version 7 being required for full POWER8 support at the start of this project) required modifying the standard processes outlined in the tested IBM solution. This presentation takes attendees through the growing pains and lessons learned while migrating a complex system to a new platform.


Chandler has led all IBM POWER-related projects within Rice's Research Computing Support Group since 2008, including a pre-GA deployment of POWER7 servers that grew into a 48-node cluster, Blue BioU. The RCSG team maintains a collection of HPC resources purchased through various grants and is experienced in providing as uniform a user experience across platforms as possible.


Download Presentation


Life at the Intersection: OpenPOWER, Open Compute, and the Future of Cloud Software & Infrastructure


  1. Provide Rackspace’s a point of view about what “the Cloud” needs from OpenPOWER, OCP, and developers in major software initiatives (Open Stack, Linux, Hypervisors, etc).
  2. Share observations about working cross-functionally among development communities, especially ones that develop as-a-Service platforms: How best to engage? Common mistakes. Success stories. What’s the give and take?
  3. Share what Rackspace (as a case study) plans to achieve now, and over the next few years, with OpenPOWER and Open Compute.


Open hardware has the potential to disrupt the datacenter and the world of software development in very positive ways.  OpenPOWER takes that potential a few steps further, both in the core system, and with technologies like CAPI.  These innovations raise the possibility of performance and efficiency improvements to a magnitude not seen for a long time.

The potential is there, but how do we drive adoption?  From platform developers?  From software developers?  From communities like OpenStack?  From service providers?  From end users?  And if we’re going to do it in the Open, that brings both big opportunities, and big challenges.  How do we manage that?

This talk will explore past experience and current impressions of someone who has done development work at the intersection of OpenStack and Open Compute for a few years.  It will cover his experience working with teams building & integrating hardware and software, for large scale as-a-Service deployments of OpenStack Nova and Ironic on Open Compute hardware.

It will also cover his take on the state of open hardware and software development today, and future frontiers.  He’ll present his thoughts and experiences getting as-a-Service developers to move further down the hardware stack, enabling the use of OpenPOWER features and technologies for the masses.


Aaron Sullivan is a Senior Director and Distinguished Engineer at Rackspace, focused on infrastructure strategy. Aaron joined Rackspace’s Product Development organization in late 2008, in an engineering role, focused on servers, storage, and operating systems. He moved to Rackspace’s Supply Chain/Business Operations organization in 2010, mostly focused on next generation storage and datacenters. He became a Principal Engineer during 2011 and a Director in 2012, supporting a variety of initiatives, including the development and launch of Rackspace’s first Open Compute platforms. He became a Senior Director and Distinguished Engineer in 2014.

These days, he spends most of his time working on next generation server technology, designing infrastructure for Rackspace’s Product and Practice Areas, and supporting the growth and capabilities of Rackspace’s Global Infrastructure Engineering team. He also frequently represents Rackspace as a public speaker, writer, and commentator. He has been involved with the Open Compute Project (OCP) since its start at Rackspace, and became formally involved in late 2012. He is Rackspace’s lead for OCP initiatives and platform designs. Aaron is serving his second term as an OCP Incubation Committee member, and sponsors the Certification & Interoperability (C&I) project workgroup. He supported the C&I workgroup as they built and submitted their first test specifications. He has also spent time working with the OCP Foundation on licensing and other strategic initiatives.

Aaron previously spent time at GE, SBC, and AT&T. Over the last 17 years, he’s touched more technology than he cares to talk about. When he’s not working, he enjoys reading science and history, spending time with his wife and children, and a little solitude.


Download Presentation

Back to Summit Details

Linking up CPTA and global OPF members: using CPTA as a stepping stone into the China market


Mr. Zhu Ya Dong, Chairman of PowerCore, China, Platinum Member of OpenPOWER Foundation


The objective is to position China POWER Technology Alliance (CPTA) as a mechanism to help global OpenPOWER Foundation members engage with China organizations on POWER-based implementations in China.


The OpenPOWER ecosystem has grown quickly in the China market, adding 12 OPF members in 2014. The China POWER Technology Alliance (CPTA) was established in October 2014, led by China’s Ministry of Industry and Information Technology (MIIT), to accelerate the build-out of a secure and trusted IT industry chain in China by leveraging OpenPOWER technology. This presentation aims to link up CPTA and global OPF members, helping them use CPTA as a stepping stone into the China market. It will explain to global OPF members WHY they should come to China, and above all HOW to come, and WHAT support services CPTA will provide. It will also clarify the relationship between CPTA and OPF in China, so that OPF members can leverage CPTA as a (non-mandatory) on-ramp to China.


Download Presentation

Back to Summit Details

Using NVM Express SSDs and CAPI to Accelerate Data-Center Applications in OpenPOWER Systems


PMC-Sierra, OpenPOWER Silver Member


The objective of this presentation is to showcase how NVM Express and CAPI can be used together to enable very high performance application acceleration in POWER8-based servers. We target applications that are of interest to large data-center/hyper-scale customers, such as Hadoop/Hive (map-reduce) and NoSQL (e.g. Redis) databases. The talk will discuss aspects of NVM Express, CAPI and the multi-threading capabilities of the POWER8 processor.


NVM Express is a standards-based method of communicating with PCIe-attached non-volatile memory. An open-source NVM Express driver has been an integrated part of the Linux kernel since March 2012 (version 3.3) and allows for very high performance. Currently there are NVM Express SSDs on the market that can achieve read speeds of over 3GB/s. In a simple configuration, a PCIe NVM Express SSD and a CAPI accelerator card are connected to a POWER8 CPU inside a POWER8 server. We present results for a platform consisting of an NVM Express SSD, a CAPI accelerator card and a software stack running on a POWER8 system. We show how the threading of the POWER8 CPU can be used to move data from the SSD to the CAPI card at very high speeds, and implement accelerator functions inside the CAPI card that can process the data at these speeds. We discuss several applications that can be serviced using this combination of NVMe SSD, CAPI and POWER8.
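The data-movement pattern described above, with host threads shuttling blocks from the SSD toward the accelerator while processing keeps pace, can be sketched as a simple producer/consumer pipeline. This is an illustrative software model only; names such as `read_block` and `process_block` are invented stand-ins, not the actual NVMe or CAPI APIs:

```python
import queue
import threading

BLOCK_SIZE = 4096
NUM_BLOCKS = 64

def read_block(i):
    """Stand-in for an NVMe read: returns one block of data."""
    return bytes([i % 256]) * BLOCK_SIZE

def process_block(data):
    """Stand-in for the accelerator function: here, a simple checksum."""
    return sum(data) % (2 ** 32)

def run_pipeline(num_workers=4):
    blocks = queue.Queue(maxsize=8)   # bounded queue models limited staging buffers
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = blocks.get()
            if item is None:          # sentinel: no more blocks
                break
            with lock:
                results.append(process_block(item))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for i in range(NUM_BLOCKS):       # producer: "SSD reads" feed the queue
        blocks.put(read_block(i))
    for _ in threads:                 # one sentinel per worker
        blocks.put(None)
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(len(run_pipeline()), "blocks processed")
```

The bounded queue is the key design point: it lets reads and processing overlap while capping how much data is staged in flight at once.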


Stephen Bates is a Technical Director at PMC-Sierra, Inc. He directs PMC’s Non-Volatile Memory characterization program and is an architect for PMC’s Flashtec™ family of SSD controllers. Prior to PMC he taught at the University of Alberta, Canada. Before that he worked as a DSP and ECC engineer. He has a PhD from the University of Edinburgh and is a Senior Member of the IEEE.


Download Presentation

Back to Summit Details

PGI compilers for OpenPOWER platforms, which will enable seamless migration of multi-core and GPU-enabled HPC applications from Linux/x86 to OpenPOWER

Presentation objective

PGI Fortran, C and C++ compilers & tools are used on Linux/x86 processor-based systems at over 5,000 high-performance computing (HPC) sites around the world. They are distinguished by HPC-focused optimizations, including automatic SIMD vectorization, and by extensive support for parallel and accelerator programming models and languages including OpenMP, OpenACC and CUDA. The objective of this talk is to give an overview of the forthcoming PGI compilers for OpenPOWER platforms, which will enable seamless migration of multi-core and GPU-enabled HPC applications from Linux/x86 to OpenPOWER, along with performance portability of HPC applications.


High-performance computing (HPC) systems are now built around a de facto node architecture with high-speed latency-optimized SIMD-capable CPUs coupled to massively parallel bandwidth-optimized accelerators. In recent years, as many as 90% of the Top 500 computing systems relied entirely on x86 CPU-based systems. OpenPOWER and the increasing success of accelerator-based systems offer an alternative that promises unrivalled multi-core CPU performance and closer coupling of CPUs and GPUs through technologies like NVIDIA’s NVLink high-speed interconnect. PGI Fortran/C/C++ compilers, until now available exclusively on x86 CPU-based systems, are distinguished by a focus on HPC features and optimizations such as automatic SIMD vectorization and support for high-level parallel and GPU programming. This talk will give an overview of the forthcoming PGI compilers for POWER+GPU processor-based systems, including features for seamless migration of applications from Linux/x86 to Linux/POWER and performance portability across all mainstream HPC systems.


Doug Miles has been director of PGI Compilers & Tools since 2003. Prior to joining PGI in 1993, Doug was an applications engineer at Cray Research Superservers and Floating Point Systems. He can be reached by e-mail at


Download Presentation

Back to Summit Details

NVIDIA Tesla Accelerated Computing Platform for IBM Power


Learn how applications can be accelerated on IBM POWER8 systems with the NVIDIA® Tesla® Accelerated Computing Platform, the leading platform for accelerating big data analytics and scientific computing. The platform combines the world’s fastest GPU accelerators, the widely used CUDA® parallel computing model, the NVLink high-speed GPU interconnect for supercomputers, and a comprehensive ecosystem of software developers, software vendors, and datacenter system OEMs to accelerate discovery and insight.


Download Presentation

Back to Summit Details

Enabling Coherent FPGA Acceleration

Speaker: Allan Cantle – President & Founder, Nallatech
Speaker Organization: ISI / Nallatech

Presentation Objective

To introduce the audience to IBM’s Coherent Accelerator Processor Interface (CAPI) Hardware Development Kit (HDK), provided by Nallatech, and to give an overview of FPGA acceleration.


Heterogeneous Computing and the use of accelerators is becoming a generally accepted method of delivering efficient application acceleration. However, to date, there has been a lack of coordinated efforts to establish open industry standard methods for attaching and communicating between host processors and the various accelerators that are available today. With IBM’s OpenPOWER Foundation initiative, we now have the opportunity to effectively address this issue and dramatically improve the use and adoption of Accelerators.

The presentation will introduce CAPI, Coherent Accelerator Processor Interface, to the audience and will detail the CAPI HDK, Hardware Development Kit, implementation that is offered to OpenPOWER customers through Nallatech. Several high level examples will be presented that show where FPGA acceleration brings significant performance gains and how these can often be further advantaged by the Coherent CAPI interface. Programming methodologies of the accelerator will also be explored where customers can either leverage pre-compiled accelerated libraries that run on the accelerator or where they can write their own Accelerated functions in OpenCL.

Speaker Bio

Allan is the founder of Nallatech, established in 1993, which specializes in compute acceleration using FPGAs. As CEO, Allan focused Nallatech on helping customers port critical codes to Nallatech’s range of FPGA accelerators and pioneered several early tools that increased porting productivity. His prior background, with BAE Systems, was heavily involved in architecting real-time, heterogeneous computers that tested live weapon systems and contained many parallel processors, including microprocessors, DSPs and FPGAs. Allan holds a 1st Class Honors EE BEng degree from Plymouth University and an MSc in Corporate Leadership from Napier University.


Download Presentation

Back to Summit Details

The Future of Interconnect with OpenPOWER


Mellanox Technologies is a founding member of the OpenPOWER Foundation, and its interconnect is the foundation for scalable, performance-demanding computing infrastructures. Delivering 100Gb/s throughput, sub-700ns application-to-application latency and message rates of 150 million messages per second, Mellanox is recognized as the world’s leading interconnect solution provider. Along with proven performance, scalability, application offloads and management capabilities, Mellanox EDR 100G solutions were selected by the DOE for CORAL (Collaboration of Oak Ridge, Argonne and Lawrence Livermore National Labs), a project launched to meet the US Department of Energy’s 2017-2018 leadership goals of competitiveness in science and of US economic and national security.

Mellanox ConnectX-4 EDR 100Gb/s technology was introduced in November at the SC’14 conference in New Orleans, LA. ConnectX-4 EDR 100Gb/s with CAPI support tightly integrates with the POWER CPU at the local bus level and provides faster access between the POWER CPU and the network device. We will discuss the latest interconnect advancements that maximize application performance and scalability on OpenPOWER architecture, including enhanced flexible connectivity with the latest Mellanox ConnectX-3 Pro Programmable Network Adapter. The new programmable adapter provides maximum flexibility for users to bring their own customized applications such as IPSEC encryption, enhanced flow steering and Network Address Translation (NAT), data inspection, data compression and others.

Speaker Bio

Speaker: Scot Schultz
Title: Director, HPC / Technical Computing

Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high speed interconnects and processor technologies. Joining Mellanox in early 2013 as Director of HPC and Technical Computing, Schultz is a 25-year veteran of the computing industry; prior to joining Mellanox, he spent 17 years at AMD in various engineering and leadership roles, including strategic HPC technology ecosystem enablement. Scot has been instrumental in the growth and development of numerous industry standards-based organizations, including OpenPOWER, the OpenFabrics Alliance, the HPC Advisory Council and many others.


Download Presentation

Back to Summit Details

One-click Hadoop cluster deployment on OpenPower systems running KVM and managed by Openstack

Hadoop workloads are memory- and compute-intensive, and POWER servers are the best choice for them: POWER is the first processor designed to accelerate big data workloads.

We implemented a PowerKVM-based Hadoop cluster solution on Power Systems and validated the performance of the teradata workload on PowerKVM virtual machines, to ensure consolidation of Hadoop workloads on PowerKVM. This paper covers how the capabilities of OpenPOWER and OpenStack simplify deployment of a Hadoop solution on POWER virtual machines. We would also like to share the VM and Hadoop cluster configuration that yields better performance.

This presentation covers one-click Hadoop cluster deployment on OpenPOWER systems running KVM and managed by OpenStack.

Pradeep K Surisetty  普拉迪普库马
Linux KVM (PowerKVM & zKVM) Test Lead
Linux Technology Centre,


Download Presentation

Back to Summit Details

Porting Scientific Applications to OpenPOWER

Speaker and co-authors

Dirk Pleiter (Jülich Supercomputing Centre)
Andrew Adinets (JSC), Hans Böttiger (IBM), Paul Baumeister (JSC), Thorsten Hater (JSC), Uwe Fischer (IBM)


While over the past years significant experience for using GPUs with processors based on the x86 ISA has been obtained, GPU-accelerated systems with POWER processors have become available only very recently. In this talk we report on early experiences of porting selected scientific applications to GPU-accelerated POWER8 systems. We will explore basic performance features through micro-benchmarks, but our main focus will be on results for full applications or mini-applications. These have been selected such that hardware characteristics can be explored for applications with significantly different performance signatures. The application domains range from physics to life sciences and have in common that they are in need of supercomputing resources. Particular attention will be given to performance analysis capabilities of the system and the available software tools. We finally will report on a newly established POWER Acceleration and Design Center, which has the goal of providing support to scientists in using OpenPOWER technologies.

Speaker’s bio

Prof. Dr. Dirk Pleiter is research group leader at the Jülich Supercomputing Centre (JSC) and professor of theoretical physics at the University of Regensburg. At JSC he is leading the work on application-oriented technology development. Currently he is principal investigator of the Exascale Innovation Center, the NVIDIA Application Lab at Jülich, as well as the newly established POWER Acceleration and Design Center. He has played a leading role in several projects for developing massively-parallel special purpose computers, including QPACE.

Speaker’s organization

Forschungszentrum Jülich – a member of the Helmholtz Association – is one of the largest research centres in Europe and member of the OpenPOWER Foundation since March 2014. It pursues cutting-edge interdisciplinary research addressing the challenges facing society in the fields of health, energy and the environment, and information technologies. Within the Forschungszentrum, the Jülich Supercomputing Centre (JSC) is one of the three national supercomputing centres in Germany as part of the Gauss Centre for Supercomputing (GCS). JSC operates supercomputers which are among the largest in Europe.


Download Presentation

Back to Summit Details

Using Docker in High Performance Computing in OpenPOWER Environment

Introduction to Authors

Min Xue Bin: Male, IBM STG China, advisory software engineer, LSF developer, mainly focus on High Performance Computing.
Ding Zhao Hui: Male, IBM STG China, Senior LSF architect, mainly focus on LSF road map.
Wang Yan Guang: Male, IBM STG China, development manager for LSF/LS.


OpenPOWER will be one of the major platforms in High Performance Computing (HPC). IBM Load Sharing Facility (LSF) is well-known cluster workload management software that aims to exploit a cluster’s computational capacity to the maximum in HPC, and LSF has proven to run well on the OpenPOWER platform. As an open platform for developers and system administrators to build, ship and run applications, Docker has been widely used in the cloud. Could we extend Docker’s benefits to HPC? Yes, we can. By integrating LSF and Docker on the OpenPOWER platform, we achieved better application docking in OpenPOWER HPC.


In HPC, there are many complex customer workloads that depend on multiple packages, libraries, and environments. It is hard to guarantee resources for customer workloads and to control performance isolation, application encapsulation, repeatability and compliance.

Our experience

We enabled LSF to work in the OpenPOWER environment, starting with IBM POWER8 Little Endian. We also ported Docker to the platform. Based on that, we completed the integration between LSF and Docker to extend its benefits to the OpenPOWER HPC area.


Download Presentation

Back to Summit Details

Introducing the Little-Endian OpenPOWER Software Development Environment and its application programming interfaces

Presented by: Michael Gschwind

Over the past three decades, the Power Architecture has been an important asset in IBM’s systems strategy. During that time, Power-based systems powered desktops, technical workstations, embedded devices, game consoles, supercomputers and commercial UNIX servers.

The ability to adapt the architecture to new requirements has been key to its longevity and success. Over the past several years, a new class of computing solutions has emerged in the form of dedicated data center scale computing platforms to power services such as search and social computing. Data center-level applications most often involve data discovery and/or serving from large repositories. Applications may either be written in traditional object-oriented languages such as C++, or in new dynamic scripting languages such as JavaScript, PHP, Python, Ruby, etc.

Because many datacenters use custom-designed servers, these applications have suffered from lock-in to merchant-silicon processors optimized for desktop environments. The new OpenPOWER consortium creates an alternative to x86 lock-in by building an open source ecosystem that offers ease of porting from the processors currently used in datacenters.

Unix and Linux applications have offered great portability in the past, but required some investment to avoid processor-specific code patterns.  To simplify porting of applications to the new Open Power environment, we reengineered the Open Power environment to simplify porting of software stacks and entire systems.

One particularly pervasive dependence is the byte ordering of data. Byte ordering affects both the layout of data in memory and of disk-based data repositories. While Power had supported both big-endian (most significant byte first) and little-endian (least significant byte first) data orderings, common Power environments have always used big-endian ordering. To address endian-ness, Power8 was defined to offer the same high performance for big- and little-endian applications. Building on that hardware capability, OpenPOWER defines a new little-endian execution environment. In addition, compiler built-in functions handle transformation of data orderings that cannot be readily changed with an endian configuration switch, such as the ordering of vector elements in the SIMD execution units.
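The practical effect of byte ordering can be seen by packing the same integer both ways. This minimal sketch uses Python's `struct` module purely as an illustration of the porting problem described above:

```python
import struct

value = 0x0A0B0C0D

big    = struct.pack(">I", value)   # most significant byte first
little = struct.pack("<I", value)   # least significant byte first

assert big    == b"\x0a\x0b\x0c\x0d"
assert little == b"\x0d\x0c\x0b\x0a"

# The same four bytes in memory decode to a different value depending on
# which ordering the consumer assumes: this is exactly what bites
# disk-based data repositories and in-memory layouts when code moves
# between big- and little-endian environments.
assert struct.unpack(">I", little)[0] == 0x0D0C0B0A
```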

Introducing a new data layout necessarily breaks binary compatibility, which created an opening to define a new Application Binary Interface governing the interoperation of program modules, covering aspects such as data layout and function calling conventions.

To respond to changes in workload behavior and programming patterns, we co-optimized hardware and software to account for evolution of workloads since the original introduction of Power:
1.        Growth in memory and data sizes:
In modern applications, external variables are accessed via data dictionaries (GOT or TOC) holding the addresses of all variables. The original IBM GOT for accessing global variables was restricted to 64KB, or about 8,000 variables, per module, reflecting the use of 16-bit offsets in Power load and store instructions; this was becoming a limitation for enterprise applications and complicated the application build process and/or degraded performance.

Power8 can combine multiple instructions into a single internal instruction with a large 4GB offset. We introduced a new “medium code model” in the ABI which takes advantage of displacement fusion to support GOTs with up to 500 million variables. By default, compilers and linkers generate fusable references for the medium code model.
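The size limits quoted above follow directly from the displacement widths: a 16-bit displacement spans a 64KB window of 8-byte GOT entries, while a fused 32-bit displacement spans 4GB. A quick arithmetic check (illustrative only):

```python
ENTRY_SIZE = 8                      # one 64-bit pointer per GOT entry

# Classic single load/store: 16-bit displacement spans a 64KB window
reach_16 = 2 ** 16
entries_16 = reach_16 // ENTRY_SIZE

# Fused instruction pair ("medium code model"): 32-bit displacement, 4GB window
reach_32 = 2 ** 32
entries_32 = reach_32 // ENTRY_SIZE

assert entries_16 == 8192           # roughly the 8,000 variables cited above
assert entries_32 == 536_870_912    # roughly the 500 million variables cited above
```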

2.        Accelerate data accesses by “inlining” data in the dictionary:
With the growth in dictionary size enabled by displacement fusion, it becomes possible to include data objects in the GOT rather than only including a pointer to the object. This reduces the number of accesses necessary to retrieve application data from memory and improves cache locality.

3.         Eliminate penalties for data abstraction:
To make object oriented programs as efficient as their FORTRAN equivalent, we expanded the passing of input and output parameters  in registers to classes.  Classes can now use up to eight floating point or vector registers per input or output parameter.  This makes it possible to code classes for complex numbers, vertices, and other abstract types as efficient as built-in types.

4.        Accelerate function calls:
Object oriented programming has led to a marked shift in programming patterns in application programs, with overall program sizes in the millions of instructions for FORTRAN codes but average function sizes dropping to tens of instructions in object-oriented applications. Consequently, reducing the fixed cost per function invocation is more important than before.

Previously, the Power ABI made initializing a callee’s entire environment the responsibility of glue code hidden from programmers and compilers on cross module calls.  To ensure environments are properly initialized for all languages, the generated glue code had to conservatively assume for these functions that addressability must be established for the new module.  Linux requires all externally visible functions to be resolved at runtime, extending the cost of dynamic linking to most functions that will ultimately resolve to calls within a module.

The new ABI makes the called function responsible to set up its own environment.  In addition, each function can have two entry points, one for local calls from within the same module to skip initialization code when no setup is necessary (this local entry point can be used either for direct calls, or via the dynamic linkage code).

5.        Simplify and accelerate function pointer calls:
The previous Power ABI had focused on providing functional completeness by representing each function pointer as a data structure (sometimes called a “function descriptor”) consisting of three pointers: the instruction address, the static environment, and the dynamic environment. This supported a broad and diverse set of languages, including FORTRAN, Algol, PL/1, PL.8, PL.9, Pascal, Modula-2, and assembly. Using such a function pointer structure, each caller could set up the environment for the callee when making a function pointer call.

Alas, with the introduction of self-initializing functions and no practical need to optimize performance for Pascal and Modula-2, the function descriptor offers few advantages but incurs three extra memory references on the critical path of function calls and data accesses. Thus, the new ABI represents a function pointer as the instruction address of the function’s first instruction.
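The cost difference can be made concrete with a toy model that counts simulated memory references: the old ABI's three-word descriptor forces three loads before the call can proceed, while the new ABI's bare instruction address needs none. This is an illustrative model, not real linker or hardware behavior:

```python
class Memory:
    """Toy memory that counts loads."""
    def __init__(self):
        self.cells = {}
        self.loads = 0
    def store(self, addr, val):
        self.cells[addr] = val
    def load(self, addr):
        self.loads += 1
        return self.cells[addr]

def call_via_descriptor(mem, desc):
    """Old ABI: a function pointer names a 3-word descriptor."""
    entry = mem.load(desc + 0)      # 1st extra memory reference: entry address
    toc   = mem.load(desc + 8)      # 2nd: static environment (TOC)
    env   = mem.load(desc + 16)     # 3rd: dynamic environment
    return entry

def call_direct(entry_addr):
    """New ABI: the pointer IS the first-instruction address."""
    return entry_addr               # no loads on the critical path

mem = Memory()
DESC = 0x100
mem.store(DESC + 0, "entry_addr")
mem.store(DESC + 8, "toc_ptr")
mem.store(DESC + 16, "env_ptr")

assert call_via_descriptor(mem, DESC) == "entry_addr"
assert mem.loads == 3               # the three references described above

before = mem.loads
assert call_direct("entry_addr") == "entry_addr"
assert mem.loads == before          # zero additional loads
```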

In addition to these ABI improvements, the new OpenPOWER software environment also includes two new SIMD vector programming APIs optimized for the little-endian programming environment, using fully little-endian conventions for referencing data structures and vector elements within the Power SIMD vector processing unit. Where necessary, the compiler translates these new little-endian conventions to the underlying big-endian hardware conventions. This is particularly useful for writing native little-endian SIMD vector applications, or when porting SIMD vector code from other little-endian platforms.

In addition, the compilers can also generate code for big-endian vector conventions but using little-endian data – an environment that is particularly useful for porting libraries originally developed for big-endian Power, such as IBM’s tuned mathematics libraries which can support both big- and little-endian environments with a common source code.

In order to simplify programming and enable code portability, we define two SIMD vector programming models: a natively little-endian model and a portability model for code developed on or shared with big-endian Power platforms. To efficiently implement these models, we extend compiler optimizations to optimize vector intermediate representations to eliminate data reformatting primitives. In addition to describing a framework for SIMD portability and for optimizing SIMD vector reformatting, we implement a novel vector operator optimization pass and measure its effectiveness: our implementation eliminates all data reformatting from application vector kernels, resulting in a speedup of up to 65% for a Power8 microarchitecture with two fully symmetric vector execution units.


Download Presentation

Back to Summit Details

Changing the Game: Accelerating Applications and Improving Performance For Greater Data Center Efficiency


Planning for exascale, accelerating time to discovery and extracting results from massive data sets require organizations to continually seek faster and more efficient solutions to provision I/O and accelerate applications. New burst buffer technologies are being introduced to address the long-standing challenges associated with the overprovisioning of storage by decoupling I/O performance from capacity. Some of these solutions allow large datasets to be moved out of HDD storage and into memory quickly and efficiently. Then, once processing is complete, data can be moved back to HDD storage far more efficiently, using unique algorithms that align small and large writes into streams, enabling users to deploy the largest, most economical HDDs to hold capacity.
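The write-alignment idea, absorbing many small random writes and flushing them as large sequential streams, can be sketched as a simple coalescing buffer. This is a minimal illustration with invented names, not DDN's actual algorithm:

```python
class CoalescingBuffer:
    """Absorb small random writes; flush them as one sorted sequential stream."""
    def __init__(self, flush_threshold=1 << 20):
        self.flush_threshold = flush_threshold
        self.pending = []            # (offset, data) pairs awaiting flush
        self.pending_bytes = 0
        self.flushes = []            # each flush = one sequential stream to HDD

    def write(self, offset, data):
        self.pending.append((offset, data))
        self.pending_bytes += len(data)
        if self.pending_bytes >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # Sort by offset so the backing HDD sees one ascending stream
        # instead of a scatter of small random writes.
        self.flushes.append(sorted(self.pending))
        self.pending = []
        self.pending_bytes = 0

buf = CoalescingBuffer(flush_threshold=16)
for off in (40, 8, 24, 0):           # small, out-of-order 4-byte writes
    buf.write(off, b"xxxx")
buf.flush()
assert len(buf.flushes) == 1         # many small writes became one stream
assert [o for o, _ in buf.flushes[0]] == [0, 8, 24, 40]
```

The point of the sketch is the trade: a little buffering memory converts random small I/O into the large sequential writes that cheap, high-capacity HDDs handle well.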

This type of approach can significantly reduce power consumption, increase data center density and lower system cost. It can also boost data center efficiency by reducing hardware, power, floor space and the number of components to manage and maintain. Providing massive application acceleration can also greatly increase compute ROI by returning wasted processing cycles to compute that were previously managing storage activities or waiting for I/O from spinning disk.

This session will explain how the latest burst buffer cache and I/O accelerator applications can enable organizations to separate the provisioning of peak and sustained performance requirements with up to 70 percent greater operational efficiency and cost savings than utilizing exclusively disk-based parallel file systems via a non-vendor-captive software-based approach.

Speaker Bio

Jeff Sisilli, senior director of product marketing at DataDirect Networks, has over 12 years’ experience creating and driving enterprise hardware, software and professional services offerings and effectively bringing them to market. Jeff is often quoted in storage industry publications for his expertise in software-defined storage and moving beyond traditional approaches to decouple performance from capacity.

Speaker Organization

DataDirect Networks


Download Presentation

Back to Summit Details

How Ubuntu is enabling OpenPOWER and innovation Randall Ross (Canonical)


Geared towards a business audience that has some understanding of POWER and cloud technology, and would like to gain a better understanding of how their combination can provide advantages for tough business challenges.


Learn how Canonical’s Ubuntu is enabling OpenPOWER solutions and cloud-computing velocity. Ubuntu is powering the majority of cloud deployments. Offerings such as Ubuntu Server, Metal-as-a-service (MAAS), hardware provisioning, orchestration (Juju, Charms, and Charm Bundles), workload provisioning, and OpenStack installation technologies simplify managing and deploying OpenPOWER based solutions in OpenStack, public, private and hybrid clouds. OpenPOWER based systems are designed for scale-out and scale-up cloud and analytics workloads and are poised to become the go-to solution for the world’s (and your businesses’) toughest problems.

This talk will focus on the key areas of OpenPOWER based solutions, including

  • Strategic POWER8 workloads
  • Solution Stacks that benefit immediately from OpenPOWER
  • CAPI (Flash, GPU, FPGA and acceleration in general)
  • Service Orchestration
  • Ubuntu, the OS that fully supports POWER8
  • Large Developer Community and mature development processes
  • Ubuntu’s and OpenPOWER’s Low-to-no barrier to entry

Speaker names / Titles

Randall Ross (Canonical’s Ubuntu Community Manager, for OpenPOWER & POWER8)
Jeffrey D. Brown (IBM Distinguished Engineer,  Chair of the OpenPOWER Foundation Technical Steering Committee) – proposed co-presenter, to be confirmed


Download Presentation

Back to Summit Details

Accelerator Opportunities with OpenPower


The OpenPower architecture provides unique capabilities which will enable highly effective and differentiated acceleration solutions. The OpenPower Accelerator Workgroup is chartered to develop both the hardware and software standards which give vendors the ability to develop these solutions. The presentation will cover an overview of the benefits of the OpenPower architecture for acceleration solutions. We will provide an overview of the Accelerator Workgroup’s plans and standards roadmap. We will give an overview of the OpenPower CAPI Development Kit. We will also walk through an example of a CAPI-attached acceleration solution.

Presentation agenda

  • Overview of the opportunity for OpenPOWER acceleration solutions
  • OpenPOWER Accelerator Workgroup charter and standards roadmap
  • OpenPOWER CAPI Development Kit
  • CAPI-attached acceleration solution example


Nick Finamore, Altera Corporation
Product Marketing Manager for Software Development Tools; Chairperson, OpenPOWER Foundation Accelerator Workgroup

For the past 3 years Nick has been leading Altera’s computing acceleration initiative and the marketing of Altera’s SDK for OpenCL. Previously, Nick held several leadership positions at early-stage computing and networking technology companies, including Netronome, Ember (SiLabs), and Calxeda. Nick also had an 18-year career at Intel, where he held several management positions, including general manager of the network processor division.


Download Presentation

Back to Summit Details

Enabling financial service firms to compute heterogeneously with Gateware Defined Networking (GDN) to build order books and trade with the lowest latency.

Abstract and Objectives

Stock, futures, and option exchanges; market makers; hedge funds; and traders require real-time  knowledge of the best bid and ask prices for the instruments that they trade. By monitoring live market data feeds and computing an order book with Field Programmable Gate Array (FPGA) logic, these firms can track the balance of pending orders for equities, futures, and options with sub-microsecond latency. Tracking the open orders by all participants ensures that the market is fair, liquidity is made available, trades are profitable, and jitter is avoided during bursts of market activity.

Algo-Logic has developed multiple Gateware Defined Networking (GDN) algorithms and components to support ultra-low-latency processing functions in heterogeneous computing systems. In this work, we demonstrate an ultra-low-latency order book that runs in FPGA logic in an IBM POWER8 server, which includes an ultra-low-latency 10 Gigabit/second Ethernet MAC, a market data feed handler, a fast key/value store for tracking level 3 orders, logic to sort orders, and a standard PSL interface which transfers level 2 market snapshots for multiple trading instruments into shared memory. Algo-Logic implemented all of these algorithms and components in logic on an Altera Stratix V A7 FPGA on a Nallatech CORSA card. Sorted L2 books are transferred over the IBM CAPI bus into cache lines of system memory. By implementing the entire feed processing module and order book in logic, the system enables software on the POWER8 server to directly receive market data snapshots with the least possible theoretical latency and jitter.

As a member of the Open Power Foundation (OPF), Algo-Logic provides an open Application Programming Interface (API) that allows traders to select which instruments they wish to track and how often they want snapshots to be transferred to memory. These commands, in turn, are transferred across the IBM-provided Power Service Layer (PSL) to the algorithms that run in logic on the FPGA. Thereafter, trading algorithms running in software on any of the 96 hyper-threads in a two-socket POWER8 server can readily access the market data directly from shared memory. When combined with a Graphics Processing Unit, a dual-socket POWER8 system optimally leverages the fastest computation from up to 96 CPU threads, high-throughput vector processing from hundreds of GPU cores, and the ultra-low latency from thousands of fine-grain state machines in FPGA logic to implement a truly heterogeneous solution that achieves better performance than could be achieved with homogeneous computation running only in software.

Presenter Bio

John W. Lockwood, CEO of Algo-Logic Systems, Inc., is an expert in building FPGA-accelerated applications. He has founded three companies focused on low latency networking, Internet security, and electronic commerce and has worked at the National Center for Supercomputing Applications (NCSA), AT&T Bell Laboratories, IBM, and Science Applications International Corp (SAIC). As a professor at Stanford University, he managed the NetFPGA program from 2007 to 2009 and grew the Beta program from 10 to 1,021 cards deployed worldwide. As a tenured professor, he created and led the Reconfigurable Network Group within the Applied Research Laboratory at Washington University in St. Louis. He has published over 100 papers and patents on topics related to networking with FPGAs and served as principal investigator on dozens of federal and corporate grants. He holds BS, MS, and PhD degrees in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign and is a member of IEEE, ACM, and Tau Beta Pi.

About Algo-Logic Systems

Algo-Logic Systems is a recognized leader of Gateware Defined Networking® (GDN) solutions built with Field Programmable Gate Array (FPGA) logic. Algo-Logic uses gateware to accelerate datacenter services, lower latency in financial trading networks, and provide deterministic latency for real-time Internet devices. The company has extensive experience building datacenter switches, trading systems, and real-time data processing systems in reprogrammable logic.


Download Presentation

Back to Summit Details

The Disruptive Technology of OpenPOWER

The OpenPOWER Foundation is certainly carrying some strong momentum as it enters its second year. As we look forward, there are many things still to be done to take the next step on our journey towards creating a broadly adopted, innovative, and open platform for our industry. I will share my Top Ten List of OpenPOWER Projects to Disrupt the Data Center. Anything and everything is fair game on this list, across all disciplines, technologies, and markets. Come join us for a fun look at how the OpenPOWER Foundation will continue to shake up the Data Center.


Dr. Bradley McCredie is an IBM Fellow, Vice President of IBM Power Systems Development and President of the OpenPOWER Foundation. Brad first joined IBM focusing on packaging for IBM’s mainframe systems. He later took a position within the IBM Power Systems development organization and has since worked in a variety of development and executive roles for POWER-based systems. In his current role, he oversees the development and delivery of IBM Power Systems that incorporate the latest technology advancements to support clients’ changing business needs.

Back to Summit Details

OpenPOWER Foundation Technical Initiatives


As the Chair of the OpenPOWER Technical Steering Committee, Mr. Brown will discuss the technical agenda of the OpenPOWER Foundation and the structure of foundation workgroups. He will describe the scope and objectives of key workgroups as well as their relationships to each other. A roadmap of workgroup activities will illustrate when the community can expect key results. The presentation will also cover three key initiatives within the OpenPOWER Foundation. These initiatives involve work recently started to enable active foundation member engagement in workgroups focused on application solution sets, IO device enablement, and compliance. Mr. Brown will be joined by Randall Ross of Canonical, who will cover application solution sets; Rakesh Sharma of IBM, who will cover broader IO device enablement; and Sandy Woodward of IBM, who is the chair of the Compliance Workgroup. Please join us for this in-depth look at the OpenPOWER Foundation’s technical activities and how we will enable ecosystem members to deliver solutions.


Jeffrey D. Brown, IBM Server and Technology Group.  Jeff is an IBM Distinguished Engineer and member of the IBM Academy of Technology.  He received a B.S. in Electrical Engineering and a B.S. in Physics from Washington State University in 1981.  He received his M.S. degree in Electrical Engineering from Washington State University in 1982.  Jeff has over 25 years of experience in VLSI development including processor, memory, and IO subsystem development projects for IBM multi-processor systems and servers.  He is the coauthor of more than 40 patent filings.  He has been the Chief Engineer on several processor and SOC chip development programs including Waternoose for the XBOX360 and Power Edge of Network.  Jeff is currently actively involved in the OpenPOWER Foundation and chairs the Technical Steering Committee.

Sandra Woodward is the OpenPOWER Foundation Compliance Work Group Chair. She received her B.S. in Electrical Engineering from University of Nebraska Lincoln and her M.S. degree in Electrical Engineering from Syracuse University.  Sandy has over 20 years of  experience with the POWER architecture.  She is a Senior Technical Staff Member at IBM, is an IBM Academy of Technology Member, and a member of the Society of Women Engineers and Women in Technology International.

Rakesh Sharma is IBM POWER Systems I/O Chief Engineer and is focused on OpenPOWER I/O.  He chairs OpenPOWER I/O Workgroup Chartering Committee. He received his Bachelors in Electrical Engineering from IIT-Kanpur, India and Masters in Computer Science from North Dakota State University, Fargo. Rakesh has over 20 years of experience with the POWER architecture specializing in I/O, Virtualization and Networking. He is a Senior Technical Staff Member and is an IBM Master Inventor.


Download Presentation

Back to Summit Details

Advancing the OpenPOWER vision


It’s been nearly a year since the public launch of OpenPOWER, and the community of technology leaders that makes up the foundation has made significant progress towards our original goals. While growth of the membership is a critical factor, our success will come from the technology provided through the ‘open model’ and the ‘value’ solutions that are enabled by leveraging that technology. Please join us as we highlight the key components that our member community has contributed to that ‘open model’ and spotlight some examples of high-value solutions enabled through members leveraging our combined capabilities and strengths.


Gordon MacKean is a Sr. Director with the Hardware Platforms team at Google. He leads the team responsible for the design and development of the server and storage products used to power Google data centers. Prior to Google, Gordon held management and design roles at several networking companies including Matisse Networks, Extreme Networks, and Nortel Networks. Gordon is a founder of OpenPOWER Foundation and serves as the Chairman of the Board of Directors. Gordon holds a Bachelors degree in Electrical Engineering from Carleton University.


Download Presentation

Back to Summit Details

Driving Open Collaboration in the Datacenter

Rapid Growth of the OpenPOWER Foundation Reflects the Need for IT Collaboration and Innovation that Extends Down to the Chip

By Calista Redmond, Director, OpenPOWER Global Alliances, IBM

The computer industry is going through radical change, triggered by increasing workloads and decreasing chip performance gains, and OpenPOWER is innovating to meet the challenge.

In August 2013, IBM, Google, Mellanox, NVIDIA and Tyan announced plans to form OpenPOWER. The OpenPOWER Foundation was incorporated as a legal entity in December 2013. The last twelve months have brought us rapid membership growth across all layers of the stack – from chip to end users – and OpenPOWER members are already innovating and bringing offerings to market.

As an open, not-for-profit technical membership group, the Foundation makes POWER hardware and software available for open development, as well as POWER intellectual property licensable to other manufacturers. The result is an open ecosystem, using the POWER Architecture to share expertise, investment, and server-class intellectual property to address the evolving needs of customers and industry.

Why OpenPOWER? Why Now?

To understand why the industry is transforming so quickly, it’s important to recognize the industry forces that brought us here. There are a number of developments that have become clear and that inspired this new strategic shift for IBM and for the industry:

  1. Silicon is not enough. Moore’s Law predictions of performance improvements with each new generation of silicon have hit a physics wall and are no longer satisfying the price/performance ratios that clients and end users are looking for.
  2. Different and growing workload demands. There is a tsunami of data flooding into organizations. In order to effectively manage the volume, address governance requirements, and get more value from data through analytics, data centers need to make adjustments to optimize for the new workload demands. This evolution is true today and will continue to change in the future. It is no longer satisfactory to take an all-purpose machine and deploy it for every workload. More specialization is required.
  3. Changing consumption model of IT. The consumption model for many end users has become the cloud. Increasingly, users want to pay as they go and turn their IT services on and off like a utility. That has also led to cloud providers facing the need to specialize the hardware they deploy in their own data centers in order to effectively support this increasingly popular consumption model. Very large internet data centers and cloud service providers want to build their own, optimizing on price performance.
  4. The continued momentum and maturity of the open source software ecosystem. Open source software has taken off. It has become a very mature ecosystem delivering at enterprise class and growing stronger every day. There is more and more reliance on the open software model.

These four trends led IBM to reflect on its own strategy. To address new challenges, IBM needed to lead the industry change. Today, the OpenPOWER Foundation is addressing that need by becoming the catalyst for open innovation that is necessary throughout the entire stack, from chip through software.

Innovation and Customization Down to the Chip

With the OpenPOWER Foundation, open development spanning software, firmware and hardware is the catalyst for change.

The OpenPOWER Foundation acts as an enabler in the industry, bringing together thought leaders across multiple parts of the IT stack to innovate together. Rather than doing innovations one at a time – one partner at a time – organizations can do them in workgroups with multiple thought leaders and experts interacting together. This means innovation can be attained at multiple levels simultaneously so that there is much greater potential of beating the price/performance curve. The result is that we are creating an optimized software ecosystem, leveraging little endian Linux so software ports easily from x86 systems.

Within the POWER chip, IBM has implemented CAPI (Coherent Accelerator Processor Interface), a capability that allows co-processors to attach directly to the POWER processor, making it easier and faster to offload tasks to specialized processors. CAPI enables systems designers to customize their systems specifically for their own workloads and user demands. By opening up the software and the hardware, right down to the chip, the Foundation is providing a forum for innovation – and making the results of that innovation broadly available.

This is creating an optimized software ecosystem that is enabling a spectrum of Power servers in the market today. Today we have OpenPOWER members designing 12 specific versions of POWER systems around the world. This is merely the beginning of the proliferation of POWER systems we expect to see from OpenPOWER.

Think of the OpenPOWER model as a buffet-style approach where organizations can pick and choose what will work absolutely best for their particular workload. Essential elements, such as memory, I/O, and acceleration, each come with multiple options.

Addressing the Emerging TCO Challenge

When we go out and talk to clients – and at IBM we are talking to end users every day – we used to have a total cost of ownership discussion that fit on one screen of a laptop. There were about six dials that they wanted to tune for their particular data center. Today, that TCO analysis is often many pages. There are many variables that organizations would like to fine-tune for the specific workloads but yet they also have a strong desire to simplify and to maximize their investment in the right number of configurations for their data center.

Through the OpenPOWER Foundation, organizations are able to customize how they consume technology by making adjustments based on the POWER Architecture. There are other architecture options out there, but ours is the most open and the most mature for the enterprise data center. Delivering open choice, riveting performance, and competitive TCO pricing strengthens the long-term value proposition our end users are realizing.

POWER Architecture Momentum

December 2014 is the first anniversary of the incorporation of the OpenPOWER Foundation.

We worked very hard to get solutions and hardware reference boards available for the public launch, which was announced in April 2014. By then, we had more than two dozen members. In July, we contributed the POWER8 firmware to open source, providing a significant signal to the market that we are very serious about enabling innovation and optimization going all the way down to the hardware level. Today, we count more than 80 OpenPOWER members. We are growing globally and now have more than a dozen members in Europe and over 20 members in Asia.

Our members’ involvement is spread across the stack from the chip level with hardware optimizations of I/O and memory, and acceleration options, and we are growing now into software. OVH, a leading internet hosting provider based in France, has just launched an on-demand cloud service based on the IBM POWER8 processor, tuned specifically for big data, high performance computing, and database workloads. In the US, Rackspace just announced their intentions to fuse the best of OpenPOWER, Open Compute, and OpenStack to drive an ultimately open data center design for cloud providers and scale out data centers.

We are also continuing to have conversations with nations that are interested in furthering their own unique domestic IT agenda as well as with large internet data centers that are moving very quickly into proof-of-concept stage with specific design points that they would like to hit for their data centers.

Some of the key milestones the OpenPOWER Foundation has made possible include:

  • The introduction of the IBM Data Engine for NoSQL – Power Systems Edition, which features the IBM FlashSystem and is the first solution to take advantage of CAPI, speeding input/output and enabling massive server consolidation.
  • The launch of the Power System S824L, which leverages OpenPOWER Foundation technology to accelerate Java, big data and technical computing applications. Here, you see an 8x faster performance on analytics workloads and that is leveraging OpenPOWER innovations together with NVIDIA, which does GPU acceleration.
  • The availability of the first non-IBM Power System now available from TYAN, a white box provider in Taiwan.
  • Collaboration across Jülich, NVIDIA, and IBM on a supercomputing center in Europe
  • Endorsement by the U.S. Department of Energy on the next generation supercomputing with a $325M contract award to OpenPOWER members
  • Launch of a CAPI with FPGA Acceleration developers kit together with Altera and Nallatech
  • Contribution of OCC firmware code for acceleration and energy management

We now have six different workgroups spread across the software and hardware layers, as well as in the area of compliance, which are making progress on deliverables. We also have another five workgroups that are in proposal stages. And, we are continuing to expand our client deployments.

We understand that it is no longer possible to accomplish what is needed at the software layer alone. What is needed is an open innovation model that goes all the way down to the chip. This is a mission no single company can or should drive alone. While we’re impressed with the momentum of this year, the strategy we’re on is taking root within the industry as thought leaders across the growing OpenPOWER community join in driving a new path forward.

Happy first birthday OpenPOWER!

OCC Firmware Code is Now Open Source

by Todd Rosedahl, Chief Energy Management Engineer on POWER

Today, IBM has released another key piece of infrastructure to the OpenPOWER community. The firmware that runs on the On Chip Controller (OCC), along with the host code that loads and initializes it, has been open sourced. The OCC provides access to detailed chip temperature, power, and utilization data, as well as complete control of processor frequency, voltage, and memory bandwidth. This enables customization for performance and energy management, or for maintaining system reliability and availability. Partners now have the flexibility to create innovative power, thermal, and performance solutions on POWER systems.


The OCC is a separate 405 processor that is embedded directly on the chip along with the main POWER processor cores. It has its own dedicated 512K SRAM, access to main memory, and 2 dedicated General Purpose off-load Engines (called GPEs). The main firmware runs a 250usec loop that utilizes the GPEs to continuously collect system power data by domain, processor temperatures, memory temperatures, and processor utilization data. The firmware communicates with the open source OpenPOWER Abstraction Layer (OPAL) stack via main memory. In conjunction with the operating system, it uses the data collected to determine the proper processor frequency and memory bandwidth to enable the following functions:

Performance Boost
The POWER processors can be set to frequencies above nominal. The OCC monitors the system and controls the processor frequency and memory bandwidth to keep the system thermally safe and within acceptable power limits.

Power Capping
A system power limit can be set. The OCC will continually monitor the power consumption and will reduce the allowed processor frequency to maintain that power limit.

Energy Saving
When the system utilization is low, the OCC infrastructure can be used to put the system into a low power state. This function can be used to comply with various government idle power regulations and standards.

System Availability
The OCC supports a Quick Power Drop signal that can be used to respond to power supply failures or other system events that require a rapid power reduction. This function enables systems to run through component or data center power and thermal failures without crashing.

System Reliability
The OCC can be used to keep component temperatures within reliability limits, extending device lifetime and limiting service costs.

Performance per Watt tuning
As the system utilization varies, the OCC can provide monitoring information and frequency control that maximizes system performance per watt metrics.

These basic functions can be implemented, enhanced, and expanded. Additionally, completely new functions can be developed using the OCC open source firmware and accompanying framework. The code and documentation are available on GitHub, along with a video overview, for more information.

Porting GPU-Accelerated Applications to POWER8 Systems

By Mark Harris

With the US Department of Energy’s announcement of plans to base two future flagship supercomputers on IBM POWER CPUs, NVIDIA GPUs, and NVIDIA NVLink interconnect, many developers are getting started building GPU-accelerated applications that run on IBM POWER processors. The good news is that porting existing applications to this platform is easy. In fact, smooth sailing is already being reported by software development leaders such as Erik Lindahl, Professor of Biophysics at the Science for Life Laboratory, Stockholm University & KTH, developer of the GROMACS molecular dynamics package:

The combination of POWER8 CPUs & NVIDIA Tesla accelerators is amazing. It is the highest performance we have ever seen in individual cores, and the close integration with accelerators is outstanding for heterogeneous parallelization. Thanks to the little endian chip and standard CUDA environment it took us less than 24 hours to port and accelerate GROMACS.

The NVIDIA CUDA Toolkit version 5.5 is now available with POWER support, and all future CUDA Toolkits will support POWER, starting with CUDA 7 in 2015. The Tesla Accelerated Computing Platform enables multiple approaches to programming accelerated applications: libraries (cuBLAS, cuFFT, Thrust, AmgX, cuDNN and many more), compiler directives (OpenACC), and programming languages (CUDA C++, CUDA Fortran, Python). You can use any of these approaches on GPU-accelerated systems based on x86, ARM, and now POWER CPUs, giving developers and system builders a choice of technologies for development and deployment.


The GPU portions of your application code don’t need to change when porting to POWER, and for the most part, neither do the CPU portions. GPU-accelerated code will generally perform the same on a POWER+GPU system compared to a similarly configured x86+GPU system (assuming the same GPUs in both systems).

Porting existing Linux applications to POWER8 Linux on Power (LoP) is simple and straightforward. The new POWER8 Little Endian (LE) mode makes application porting even easier by eliminating data conversion complications. Even so, when targeting a new CPU, it’s useful to know the tools available for achieving highest performance. By knowing a handful of useful compiler flags and directives, you can get performance improvements right out of the gate. The following flags and directives are specific to IBM’s xlc compiler.

Useful Compiler Options and Directives

POWER8 is known for its low latency and its high-bandwidth memory and SMT8 capabilities (8 simultaneous hardware threads per core). The -qarch and -qtune flags come in handy for automatic exploitation of the POWER8 ISA.

-qarch=pwr8 -qtune=pwr8

For SMT-aware tuning, you can use sub-options to the -qtune option to specify the exact SMT mode. The options are balanced, st (single thread), smt2, smt4, or smt8. SMT-aware optimizations allow for locality transformation and instruction scheduling.

In addition to SMT tuning, automatic data prefetching, automatic SIMDization, and Higher-Order Transformations (HOT) on loops can be enabled using -O3 -qhot. For best out-of-the-box results, you can combine options.

-O3 -qhot -qarch=pwr8 -qtune=pwr8

Automatic SIMDization works best when pointer use and control flow are limited. The loop directive #pragma independent can be used to tell the compiler a loop has no loop-carried dependencies. Use either the restrict keyword or the disjoint pragma when possible to tell the compiler that references do not share the same physical storage. Expose stride-one access when you can to limit strided accesses.

By adding these flags and directives to your bag of tricks, you can significantly improve your application performance out of the box.

Get Started Now

For more performance optimization and tuning techniques (e.g., dynamic SMT selection, gcc specifics), refer to Chapter 6 (Linux) of the IBM Redbook “Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8”.

Visit this IBM PartnerWorld page for information about developer access to POWER systems for evaluation, development, and porting. POWER+GPU system access is available upon request.

Joining the CUDA registered developer program is your first step in establishing a working relationship with NVIDIA Engineering. Membership gives you access to the latest software releases and tools, notifications about special developer events and webinars, and access to report bugs and request new features.

The OpenPOWER Foundation was founded in 2013 as an open technical membership organization that enables data centers to rethink their approach to technology. Member companies are enabled to customize POWER CPU processors and system platforms for optimization and innovation for their business needs. These innovations include custom systems for large-scale data centers, workload acceleration with GPUs, FPGAs or advanced I/O, platform optimization for SW appliances, and advanced hardware technology exploitation. Visit the OpenPOWER Foundation website to learn more.

SC14: OpenPOWER and the State of Supercomputing

By Ken King, GM of OpenPOWER Alliances, IBM

Last week, SC14 brought together the brightest minds and organizations in the high-performance computing (HPC) industry. It was truly exciting to discuss with these experts some of the business challenges HPC is tackling today – from medical research to investment banking to weather forecasting – and possibilities for the future. IBM and the OpenPOWER Foundation shared our vision for the future of technical computing, in which open innovation leads to accelerated and more compelling development of HPC systems.

IBM kicked off the show highlighting our recently announced $325M contract award from the U.S. Department of Energy (DOE) to develop and deliver advanced “data centric” supercomputing systems, which will advance discovery in science, engineering and national security. In a move that could shake up the high performance computing industry, IBM’s new OpenPOWER-based systems use a data centric approach and put computing power everywhere data resides, minimizing data in motion, energy consumption, and cost/performance. These systems are the debut of OpenPOWER innovation in supercomputing and the result of the collaboration of OpenPOWER Foundation members, including IBM, NVIDIA and Mellanox.

The DOE project is just the beginning when it comes to how IBM and the OpenPOWER Foundation plan to revolutionize supercomputing. The fact is that traditional supercomputing approaches are no longer keeping up with the enormous growth of big data, and Moore’s Law can no longer be relied on for historical performance gains; the industry needs open collaboration to develop the data centric, high performance systems required to tackle today’s biggest challenges. That’s where the OpenPOWER community comes in, as a force of material innovation vital to shaping the future of technical computing. With more than 70 companies, including NVIDIA, Mellanox, Altera and Nallatech, the OpenPOWER Foundation is incorporating advanced technology like GPUs, NICs and FPGA cards, all of which have the potential to transform today’s supercomputing capabilities in an open, integrated fashion. With these possibilities, the future of supercomputing is becoming more open than ever – and the OpenPOWER Foundation is leading the way.

In addition to the forward-thinking that was on display at SC14, we were also excited to see that the HPC industry is recognizing the disruptive potential of the combination of IBM Power Systems and OpenPOWER innovations. IBM won several HPCwire awards, including an Editor’s Choice for Best HPC Server Product or Technology for IBM POWER8 processor-based systems, recognizing Power’s superior performance for HPC systems. The OpenPOWER Foundation also won an Editor’s Choice for Top 5 New Products or Technologies to Watch for its potential to transform how HPC systems are built in the future.

We were pleased to have an impact on the annual SuperComputing event and to share our vision for the future of technical computing. We look forward to returning in the years to come to further showcase how the new data centric paradigm and open collaborative innovation in supercomputing (via OpenPOWER) are transforming the industry.

Mellanox and OpenPOWER Partners Sponsor “Innov8 with POWER8” Academic Challenge

By Scot Schultz, Director of HPC and Technical Computing, Mellanox Technologies

Mellanox has partnered with IBM and a group of fellow OpenPOWER Foundation member companies, including NVIDIA and Altera, to launch a brand new academic challenge for computer science graduate students. Called “Innov8 with POWER8,” the program involves three top universities: North Carolina State University, Rice University, and Oregon State University. This fall semester, each school was provided with OpenPOWER-compatible IBM POWER8 systems enabled with Mellanox’s industry-leading interconnect. The goal of the challenge is to enable the students to leverage the OpenPOWER server platform to drive innovation on a variety of specialized projects, each focused on the themes of Big Data, genomics, or cloud computing.

Earlier this year, at the IBM Impact 2014 conference, Mellanox demonstrated a 10x improvement in throughput and latency of a key-value store application on POWER8 architecture. Mellanox Host Channel Adapters provide the highest-performing interconnect solution for enterprise data centers, HPC and cloud computing, and are also capable of remote direct memory access (RDMA). RDMA allows direct memory access from remote systems without involving the operating system or consuming other CPU resources; coupled with the innovative OpenPOWER-compatible POWER8 architecture, this makes it the perfect platform for the universities to accelerate research and development for real-world challenges. The projects are already in development, and the initial scope of project work looks exciting. Take a look below at what the universities will be working on this semester:

North Carolina State University
NCSU’s projects address real-world bottlenecks in deploying big data solutions. NCSU has built up a strong set of skills in Big Data, having worked closely with the IBM Power Systems team to push the boundaries in delivering what clients need.  These projects extend their work to the next level, taking advantage of the accelerators that are a core element of the POWER8 value proposition.

  • Project 1: NCSU will focus on Big Data optimization, accelerating the preprocessing phase of their Big Data pipeline with Power-optimized, coherently attached reconfigurable accelerators in FPGAs from Altera. The team will assess the work from the IBM Zurich Research Laboratory on text analytics acceleration, aiming to eventually develop their own accelerators.
  • Project 2: The university’s second project focuses on smart storage. The team is looking to leverage the Zurich accelerator in the storage context as well.

Rice University
Rice University has recognized that genomics information consumes massive datasets; however, developing the infrastructure required to rapidly ingest, perform analytics, and store this information is a challenge. Rice’s initiatives, in collaboration with NVIDIA and Mellanox, are designed to accelerate the adoption of these new big data and analytics technologies in medical research and clinical practice.

  • Project 1: Rice students will exploit the massive parallelism of GPU accelerator technology and linear programming algorithms to provide deeper understanding of basic organism biology, genetic variation and pathology, adapting a multi-GPU implementation of the simplex algorithm to genome assembly and benchmarking it.
  • Project 2: Students will develop new approaches to high-throughput, systematic identification of chromatin loops between genomic regulatory elements, utilizing GPUs to efficiently search, in parallel, the space of possible chromatin interactions for true chromatin loops.

Oregon State University
Oregon State University’s Open Source Lab has been a leader in open source cloud solutions on Power Systems, even providing cloud hosting for more than 160 projects. These new projects create strong Infrastructure-as-a-Service (IaaS) offerings, leveraging the network strengths of Mellanox, as well as improving the management of the cloud solutions via a partnership with Chef.

  • Project 1: Oregon State University will focus on cloud enablement, working to create an OpenPOWER stack environment to demonstrate Mellanox networking and cloud capabilities.
  • Project 2: The university will take an open technology approach to cloud, using Linux, OpenStack and KVM to create a platform environment managed by Chef in the university’s Open Source Lab.

As you can see, the work that is underway is impressive. Mellanox salutes each of the students involved and we look forward to hearing about their progress throughout the semester, and ultimately learning which student team is named “Best in Class” at the IBM InterConnect conference in February!

This challenge is just the beginning. Universities may become members of the OpenPOWER Foundation for free to take advantage of the industry momentum, engage in technical work groups and strategic initiatives, and more. To find out more, visit the OpenPOWER Foundation.

How the IBM-GLOBALFOUNDRIES Agreement Supports OpenPOWER’s Efforts

By Brad McCredie, President of OpenPOWER and IBM Fellow and Vice President of Power Development

On Monday IBM and GLOBALFOUNDRIES announced that they had signed a Definitive Agreement under which GLOBALFOUNDRIES plans to acquire IBM’s global commercial semiconductor technology business, including intellectual property, world-class technologists and technologies related to IBM Microelectronics, subject to completion of applicable regulatory reviews. From my perspective as both OpenPOWER Foundation President and IBM’s Vice President of Power Development, I’d like to share my thoughts with the extended OpenPOWER community on how this Agreement supports our collective efforts.

This Agreement, once closed, will enhance the growing OpenPOWER ecosystem consisting of both IBM and non-IBM branded POWER-based offerings. While of course our OpenPOWER partners retain an open choice of semiconductor manufacturing partners, IBM’s manufacturing base for our products will be built on a much larger capacity fab that should benefit potential customers.

IBM’s sharpened focus on fundamental semiconductor research, advanced design and development will lead to increased innovation that will benefit all OpenPOWER Foundation members. IBM will extend its global semiconductor research and design to advance differentiated systems leadership and innovation for a wide range of products including POWER based OpenPOWER offerings from our members. IBM continues its previously announced $3 billion investment over five years for semiconductor technology research to lead in the next generation of computing.

IBM remains committed to an extension of the open ecosystem using the POWER architecture; this Agreement does not alter IBM’s commitment to the OpenPOWER Foundation. This announcement is consistent with the goals of the OpenPOWER Foundation to enable systems developers to create more powerful, scalable and energy-efficient technology for next-generation data centers. The full stack — beginning at the chip and moving all the way to middleware software — will drive systems value in the future. IBM and the members of the OpenPOWER Foundation will continue to lead the challenge to extend the promise that Moore’s Law could not fulfill, offering end-to-end systems innovation through our robust collaboration model.

Today’s Agreement reaffirms IBM’s commitment to move towards world-class systems — both those offered by IBM and those built by our OpenPOWER partners that leverage POWER’s open architecture — that can handle the demands of new workloads and the unprecedented amount of data being generated. I look forward to our continued work together, as IBM extends its semiconductor research and design capabilities for open innovation for cloud, mobile, big data analytics, and secure transaction-optimized systems.

TYAN’s OpenPOWER Customer Reference System Now Available

Source: Tyan

Innovative, Collaborative and Open

Open resources, management flexibility, and hardware customization are becoming more important to IT experts across various industries. To meet the emerging needs of evolving IT worlds, TYAN is honored to present its Palmetto system, the TYAN GN70-BP010. As the first commercialized customer reference system provided by an official member of the OpenPOWER ecosystem, the TYAN GN70-BP010 is based on the POWER8 architecture and follows the OpenPOWER Foundation’s design concept.

The TYAN GN70-BP010 is a customer reference system that allows end users to deploy software based on the OpenPOWER architecture tailored to their individual requirements. It provides another opportunity for users to run their applications in a cost-effective and flexible way. It is an innovative and collaborative hardware solution for IT experts who are looking for a more open, flexible, customized, and intelligent IT deployment.

TYAN GN70-BP010 Product Features:

Enclosure

  • Industry 19″ rack-mountable 2U chassis
  • Dimensions: D27.56″ x W16.93″ x H3.43″ (D700 x W430 x H87mm)
  • (8) 2.5″/3.5″ hot-swap HDD bays

Power Supply

  • (1+1) 770W DPS-770CB B (80 Plus Gold)

System Cooling

  • (4) 6cm hot-swap fans

Motherboard

  • SP010GM2NR, ATX 12″ x 9.6″ (304.8 x 235.2mm)

Processor

  • (1) IBM® POWER8 Turismo SCM processor

Memory

  • (4) 240-pin R-DDR3 1600/1333MHz ECC DIMM sockets

Expansion Slots

  • (1) PCI-E Gen3 x16 slot
  • (1) PCI-E Gen3 x8 slot

Integrated LAN Controllers

  • (2) GbE ports (via BMC5718)

Storage

  • (4) SATA-III 6.0Gb/s ports (via Marvell 88SE9235)

Rear I/O

  • (2) GbE RJ45 ports
  • (1) Stacked dual-port USB 3.0
  • (1) Stacked COM port and VGA port
  • (1) FPIO (reboot/power-on button/HDD LED/power-on LED)

Management

  • AST2400 iBMC w/iKVM (IPMI v2.0 compliant)

For more information or product availability, please contact:

World’s First OpenPOWER App Throwdown Showcases Five Strong ISV Innovations

By Terri Virnig, Vice President, Power Ecosystem and Strategy, IBM Systems & Technology Group

Last year, IBM and the founders of the OpenPOWER Foundation shook up the server industry when they announced IBM’s POWER8 chip would be open for cross-industry development.  Fast forward one year and the organization has grown 12x, with 61 co-collaborators on board and counting.  The strong attraction has likely stemmed from a shared belief that openness is key to innovation – that no one company alone can own the innovation agenda for an entire industry.

Independent software vendors (ISVs) historically have been quick to embrace openness, and this is no different. ISVs are among the first to begin leveraging OpenPOWER’s development building blocks, including essential technical specifications and hundreds of thousands of lines of firmware code. As a result, ISVs are bringing forward several interesting new apps designed not just for IBM Power Systems running Linux, but also compliant with any future non-IBM, OpenPOWER-based system or solution to come to market.

To further support this momentum, we’re pleased to announce today the world’s first OpenPOWER App Throwdown taking place at IBM’s Enterprise2014 conference in Las Vegas.   The contest builds upon the success of our Linux on Power App Throwdown, and will recognize some of the most innovative applications being developed to solve real business challenges.

After reviewing 21 fantastic submissions, the competition has been narrowed down to five exceptional finalists. All finalists have built apps that leverage POWER’s Big Data capabilities in an open environment, solving problems across healthcare, retail and more with solutions that can tackle a variety of growing business challenges in new ways. And you will be able to decide the winner, first by viewing the finalists’ videos, and then by voting on Twitter with the hashtag #IBMEnterpriseApp along with the finalist’s Twitter handle (see list below).

We want to thank all of the teams who submitted their contributions. Below are the finalists in the OpenPOWER App Throwdown:

  • Information Builders (@infobldrs) built WebFOCUS 8, a reporting application running on Linux on Power that evaluates the performance of Power compared to x86 machines. The company also created OEM Workload for Power, which shows which POWER8 architecture would best fit customers based on their workloads.
  • ARGOS Computer Systems (@ARGOS_Computers) runs its cognitive engine on Linux on POWER8 systems. The engine has demanding workloads, running cognitive agents in several virtual environments.  The agents make financial transactions, calls, purchases, and can even interact with Human Resources. POWER8 increases productivity of the engine by effectively doubling threads, powering more virtual agents.
  • Redis Labs (@RedisLabsInc) worked with IBM to port and optimize its open-source in-memory NoSQL database for flash. The original Redis database took the world by storm, and by porting the capability to the POWER8 platform, the solution has now become significantly more cost-efficient. The solution runs on POWER8 using CAPI flash, cutting deployment costs by 70 percent and achieving a 24-to-1 resource consolidation versus x86-based deployments.
  • Zato Health (@zatohealth) delivers its Interoperability Platform via Power Systems on Linux, enabling proactive personalized medicine by accessing electronic health records across clinical and genomic data silos, data centers and organizations. It uses natural language processing to determine diagnostic criteria to better tailor treatment, identify opportunities for early intervention, and detect potential insurance savings qualifications.
  • Zend Technologies (@zend) developed Zend Server, an application platform for PHP running on POWER8 which can significantly improve the performance of data analysis for a variety of applications. One of its key features is Z-Ray, an analytics tool that evaluates application performance, giving insight into website data like event monitoring, database queries, execution and memory performance.

Through the OpenPOWER App Throwdown, we can see firsthand how POWER’s open architecture is able to drive meaningful innovation. No matter the winner, we are proud to work with all of these top-rate teams.

So, now is the time to cast your vote with your social media voice.  Tweet #IBMEnterpriseApp and the Twitter handle of the ISV you feel most deserves to win.  The winner will be announced at the IBM Enterprise2014 ISV/MSP Mashup.  Then, be on the lookout for a live tweet from the OpenPOWER Twitter handle (@OpenPOWEROrg) announcing the first winner of the OpenPOWER App Throwdown!

Opening Up in New Ways: How the OpenPOWER Foundation is Taking Open to New Places

By Jim Zemlin – August 25, 2014

It’s no secret that open development is the key to rapid and continuous technology innovation. Openly sharing knowledge, skills and technical building blocks is something that we in the Linux community have long been promoting and have recognized as a successful model for breeding technology breakthroughs. Much of the effort by The Linux Foundation and its peers to date has been centered on fostering openness at the software level, starting right at the source, the operating system, and building up from there. Traditionally, the agenda has not included a great amount of attention on how to open up at the hardware level. Until now.

A year ago, many of us in the Linux community took notice when IBM, NVIDIA, Mellanox, Tyan and Google announced their intentions to form the OpenPOWER Foundation, a group through which the IBM POWER processor architecture would be opened up for development. Now, one year later, the group has officially formed and the notion of open hardware development that starts at the processor level has resonated with many.

According to OpenPOWER, they now have 53 members and seven working groups focused on enabling broad industry innovation across the full hardware and software stack. Through the Foundation, member companies are free to use the POWER architecture for custom open servers and components for Linux based cloud data centers, or any processor application they choose.

Fostering open collaboration at all levels – from the chip and on up through the entire hardware and software stacks – is what is needed to drive a new era of innovation. To this end, The Linux Foundation looks forward to partnering with the OpenPOWER Foundation in the near future on projects in which we have a shared vision. In particular, we will aim to work together in ways that can address some of today’s largest technology challenges – like better harnessing Big Data, addressing security concerns and energy efficiency – in a way that unlocks opportunity for all.

So, with that, let me officially welcome the OpenPOWER Foundation to the community. We look forward to working together to drive open innovation in new ways and in new places.

Source: Linux Foundation

Unchaining the data center with OpenPOWER: Reengineering a server ecosystem

By Michael Gschwind, STSM & Senior Manager, System Architecture, IBM

Later today at HOT CHIPS, a leading semiconductor conference, I will be providing an update on IBM’s POWER8 processor and how, through the OpenPOWER Foundation, we are making great strides opening the processor up not just from a hardware perspective, but also at the software level.

It was at this same show last year that my colleague, IBM POWER hardware architect Jeff Stuecheli, first revealed how POWER8 would be made open for development. This move has been met with great excitement over the past twelve months and has been seen as an important milestone because, with the advent of Big Data, companies are demanding more from their data centers — more than what commodity servers built on decades-old PC-era technology can deliver. POWER technology is designed specifically to meet these demands and, because it is open, it frees technology providers to innovate together and accelerate industry advancement.

Beyond being a significant technical and open-development milestone, POWER8 is also the basis for the OpenPOWER Foundation, an open technical organization formed by data center industry leaders that enables data center operators to rethink their approach to technology. In a world where there’s constant tension between the need for standardization and the need for innovation, OpenPOWER was created to foster an open ecosystem, using the POWER architecture to share expertise, investment, and server-class intellectual property to serve the evolving needs of customers.

OpenPOWER is about choice in large-scale data centers:

  • The choice to differentiate — Through the Foundation, members can build workload-optimized solutions customized for their servers and use best-of-breed components from an open ecosystem, instead of settling for “one size fits all.” This will in turn increase value.
  • The choice to innovate — The OpenPOWER Foundation offers a collaborative environment where members can jointly create a vibrant open ecosystem for data centers.
  • The choice to grow — Each member of the Foundation can implement new capabilities instead of relying on technology scaling of a stagnant PC architecture that has run out of headroom to grow.

After all that has been accomplished through the OpenPOWER Foundation on the hardware side, today I want to share some new advances on the software side. First, I am happy to announce that the new OpenPOWER Application Binary Interface (ABI) has been published. The ABI is a collection of rules, maintained by the OpenPOWER Foundation, that standardizes how application components interoperate. This is significant because separately compiled and independently optimized components can only work together efficiently when they agree on these conventions.

Second, the OpenPOWER Vector SIMD Programming Model has been implemented. It moves beyond traditional hardware-centric SIMD programming models, with the goal of creating an intuitive programming model and facilitating application portability, while enabling compilers to optimize OpenPOWER workloads even better.

These advancements were made possible through consultation with OpenPOWER members, and they will grant more room for innovation at all levels of the hardware and software stacks.

The OpenPOWER Foundation’s collaborative innovation is already changing the industry, and major data center stakeholders are joining OpenPOWER. If you want to learn more about the OpenPOWER Foundation, visit

Members can now request early access to Tyan reference board

We are excited about the progress that the OpenPOWER Foundation member companies have made since our public launch in San Francisco back in April. Members can now request early access to the Tyan reference board shown below by emailing Bernice Tsai at Tyan. This is a single-socket, ATX form factor POWER8 motherboard on which members can bring up a Debian Linux distribution (Little Endian) to start innovating. We look forward to seeing the great ideas that will be generated by working together!


Blog | IT powers new business models

Source: News Bytes

People and businesses today are rapidly adopting new technologies and devices that are transforming the way they interact with each other and their data.

This digital transformation generates 2.5 quintillion bytes of data associated with the proliferation of mobile devices, social media and cloud computing, and drives tremendous growth opportunity.

Canonical Supporting IBM POWER8 for Ubuntu Cloud, Big Data

Source: The VAR Guy

If Ubuntu Linux is to prove truly competitive in the OpenStack cloud and Big Data worlds, it needs to run on more than x86 hardware. And that’s what Canonical achieved this month, with the announcement of full support for IBM POWER8 machines on Ubuntu Cloud and Ubuntu Server.

As Computing Tasks Evolve, Infrastructure Must Adapt

Source: Forbes

The litany of computing buzzwords has been repeated so often that we’ve almost glazed over: mobile, social, cloud, crowd, big data, analytics.  After a while they almost lose their meaning.

Taken together, though, they describe the evolution of computing from its most recent incarnation — single user, sitting at a desk, typing on a keyboard, watching a screen, local machine doing all the work — to a much more amorphous activity that involves a whole new set of systems, relationships, and actions.

Making Power Open to the Enterprising Masses

Source: IBM Research News Blog

Since their development in the 1990s, IBM Power Systems have served databases, crunching big data for big business better than anyone else in the industry. But so that these systems would support the boom of mobile and cloud computing – not to mention social media and its unstructured-data ilk – IBM decided to open POWER8 technology up to the world via the OpenPOWER Foundation.

IBM is changing the server game

Source: Disk91

There was something I missed in IBM’s strategy when they sold the x86 branch to Lenovo. Since reading some articles about OpenPOWER and Google’s home-made first POWER8 server, this strategy is making more sense.

Google Shows Off Hardware Design Using IBM Chips

Source: WSJ

It’s no secret that IBM wants to move its technology into the kind of data centers that Google and other Web giants operate. Now comes evidence that Google is putting some serious work into that possibility.