Invited Keynote Lecture 1
Towards Reconfigurable High Performance Computing based on Co-Design Concept
[Lecture Slide]
Dr. Taisuke Boku,
2011 Gordon Bell Prize (Sustained Performance Prize)
Deputy Director and Professor, Center for Computational Sciences, University of Tsukuba, Japan
Biography
Prof. Taisuke Boku received his Master's and Ph.D. degrees from the
Department of Electrical Engineering at Keio University. After serving
as an assistant professor in the Department of Physics at Keio
University, he joined the Center for Computational Sciences (formerly
the Center for Computational Physics) at the University of Tsukuba,
where he is currently the deputy director, the HPC division leader,
and the system manager of supercomputing resources. He has worked
there for more than 20 years on HPC system architecture, system
software, and performance evaluation of various scientific
applications. Over these years, he has played a central role in the
development of CP-PACS (ranked number one in the TOP500 in 1996),
FIRST (a hybrid cluster with a gravity accelerator), PACS-CS (a
bandwidth-aware cluster), and HA-PACS (a high-density GPU cluster),
all representative supercomputers in Japan. He also contributed to the
system design of the K Computer as a member of the architecture design
working group at RIKEN, and he is currently a member of the operation
advisory board of AICS, RIKEN. He received the ACM Gordon Bell Prize
in 2011. His recent research interests include accelerated HPC systems
and direct communication hardware and software for accelerators in HPC
systems based on FPGA technology.
Lecture Summary
FPGAs and reconfigurable hardware systems have long been studied as an
effective solution for HPC (High Performance Computing), mainly for
their computational capability and the ease with which their circuits
can be tailored to an application's characteristics. However, recent
advances in commodity CPU technology have sharply increased clock
frequencies and absolute floating-point throughput through SIMD
(single instruction / multiple data streams) instructions and
multi-core implementation on a chip, while FPGA frequencies have grown
relatively slowly. The traditional approach of using FPGAs purely for
computation therefore no longer offers a great advantage. One of the
keywords in the HPC field today is "codesign," in which application
requirements and hardware limitations must meet at a middle ground to
achieve the best coupling and balance between them under the
constraint of power consumption. I strongly believe that FPGAs, with
their strong reconfigurability and flexible circuit utilization,
should play an important role in this new concept of HPC.
On the other hand, to achieve absolute computing performance in HPC,
we need to parallelize everything in the system. To avoid performance
bottlenecks on the data paths between components, we need strong
connectivity among them. Most current HPC components, such as CPUs,
GPUs, accelerators, network interfaces, and storage drives, are
connected by PCI Express today; in other words, "PCI Express rules
everything." At the University of Tsukuba, we and our collaborators
have been focusing on the importance of PCI Express, which can be used
not just as a commodity data path between components within a
computation node but also across node boundaries. To solve this data
path problem, we are applying FPGA technology, with its flexible
circuit programmability as well as its rich IP libraries for various
processing components, including data interfaces.
In this talk, I present effective FPGA utilization for the field of
HPC, which is facing serious performance issues, based on the concept
of hardware/software codesign for the next generation of HPC
systems. FPGAs can serve both as effective computing components and as
communication facilities, offering an answer to today's HPC problems.
Invited Keynote Lecture 2
Micron's Automata Processor Architecture: Reconfigurable and Massively Parallel Automata Processing
[Lecture Slide]
Mr. Harold Noyes,
Senior Architect
DRAM Solutions Group, Micron Technology,
USA
Biography
Harold Noyes joined Micron Technology's DRAM Solutions Group in 2007 as the senior architect (hardware),
working on the Automata Processor investigation and development. Prior to joining Micron, Mr. Noyes held
a variety of research and development positions with Hewlett-Packard Company (25 years) and
Freescale Semiconductor (2 years). His experience spans both engineering and project management roles,
including Automata Processor architecture development, printed circuit assembly design, electromagnetic
and regulatory compliance, modem design, full custom silicon design, ASIC design, and technical writing.
Mr. Noyes earned a B.S. in electrical engineering from Utah State University.
Lecture Summary
Frequency scaling and architectural enhancements traditionally provided by Moore's Law are no longer adequately addressing computationally intensive problems.
Multicore processing architectures, capable of increasing performance in certain applications, fall short when it comes to unstructured data set processing and algorithms that are not easily modified for parallel execution.
Reconfigurable silicon architectures, with purpose-built machines controlled by traditional CPUs, take a somewhat similar approach but are limited by the practical considerations of power, cost, size, and speed.
Micron's Automata Processor architecture overcomes many of the obstacles facing modern von Neumann architectures and is poised to play an important role in solving some of the most challenging computational problems.
It uses memory-based reconfigurable technology to create purpose-built, data-driven machines called automata.
Inherent to the architecture is the ability to operate all of these automata—thousands or even millions—completely in parallel.
This presentation will cover the Automata Processor architecture and programming model, as well as the software development kit.
Examples of possible applications will also be presented, along with the associated cost, power, and performance improvements that are anticipated.
Invited Keynote Lecture 3
Towards a Scalable and Configurable Accelerator
[Lecture Slide]
Dr. Simon See,
Director and Chief Solution Architect
Nvidia Inc. Asia Pacific,
Singapore
Biography
Dr. Simon See is currently the High Performance Computing Technology Director and Chief Solution Architect for Nvidia Inc., Asia Pacific, and also a Professor and Chief Scientific Computing Officer at Shanghai Jiao Tong University.
Concurrently, A/Prof. See is also the chief scientific computing advisor for BGI (China).
His research interests are in the areas of High Performance Computing, computational science, applied mathematics, and simulation methodology.
He has published over 100 papers in these areas and has won various awards.
Dr. See graduated from the University of Salford (UK) with a Ph.D. in electrical engineering and numerical analysis in 1993.
Prior to joining Nvidia, Dr. See worked for SGI, DSO National Laboratories of Singapore, IBM, International Simulation Ltd (UK), Sun Microsystems, and Oracle.
He also provides consultancy to a number of national research and supercomputing centers.
Lecture Summary
In the last few years, the high performance computing community has been increasingly adopting accelerators such as GPUs. Accelerators allow one to scale up and increase performance within a certain power envelope. However, in order to scale to large systems and address different types of applications, one has to design systems that are configurable and scalable. In this talk, the author will discuss some ideas for the next generation of GPUs.