HECURA I/O Projects

  1. HaRD: The Wisconsin Hierarchically-Redundant, Decoupled Storage Project
  2. CRAM: A Congestion-Aware Resource and Allocation Manager for Data-Intensive High-Performance Computing
  3. Active Object Storage to Enable Scalable and Reliable Parallel File System
  4. An Application Driven I/O Optimization Approach for PetaScale Systems and Scientific Discoveries
  5. EAGER: Autonomous Data Partitioning Using Data Mining for High End Computing
  6. RUI: Automatic Identification of I/O Bottleneck and Run-time Optimization for Cluster Virtualization
  7. Adaptive Techniques for Achieving End-to-End QoS in the I/O Stack on Petascale Multiprocessors
  8. Optimization Algorithms for Large-scale, Thermal-aware Storage Systems
  9. Multidimensional and String Indexes for Streaming Data
  10. Cross-Layer Exploration of Non-Volatile Solid-State Memories to Achieve Effective I/O Stack for High-Performance Computing Systems
  11. Visual Characterization of I/O System Behavior for High-End Computing
  12. Automatic Extraction of Parallel I/O Benchmarks from HEC Applications
  13. Secure Provenance in High-End Computing Systems
  14. Scalable Data Management Using Metadata and Provenance
  15. Streamlining High-End Computing with Software Persistent
  16. Interleaving Workloads with Performance Guarantees on Storage Cluster
  17. Programming Models and Storage System for High Performance Computation with Many-Core Processors
  18. A Dynamic Application-specific I/O Architecture for High End Computing
  19. Balanced Scalable Architectures for Data-Intensive Supercomputing
  20. Performance- and Energy-Aware HEC Storage Stacks
  21. QoS-driven Storage Management for High-end Computing Systems
  22. A New Semantic-Aware Metadata Organization for Improved File-System Performance and Functionality in High-End Computing

HaRD: The Wisconsin Hierarchically-Redundant, Decoupled Storage Project

The Wisconsin Hierarchically-Redundant, Decoupled storage project (HaRD) investigates the next generation of storage software for hybrid Flash/disk storage clusters. The main objective of the project is to improve the performance of storage in a variety of diverse scenarios, including new application environments such as photo storage as found in Facebook and Flickr, high-end scientific processing as found in government labs, and large-scale data processing such as that found in Google and Microsoft. The HaRD project focuses on three key issues in order to improve performance of these important applications: client-side Flash-based RAID and file-system integration, server-side memory reduction and multicore scheduling of file-system tasks, and scheduled network transfers. HaRD pulls together these technologies into a synthesized whole through three targeted storage systems: a scalable photo server, a high-performance checkpoint subsystem, and an improved file system for MapReduce workloads. The impact of this project is significant, as HaRD helps to shape the storage software architecture of the next generation of cloud computing services, which are of increasing relevance to both industry and society at large.


CRAM: A Congestion-Aware Resource and Allocation Manager for Data-Intensive High-Performance Computing

This project will develop a job scheduling and resource allocation system for data-intensive high-performance computing (HPC) based on congestion pricing of a system's heterogeneous resources. This extends resource management beyond processing: it also allocates memory, disk I/O, and the network among jobs. The research will overcome the critical shortcomings of processor-centric resource management, which wastes huge portions of cluster and supercomputer resources on data-intensive workloads. For example, I/O bandwidth governs the performance of many modern HPC applications but, at present, is neither allocated nor managed. The research will develop techniques that (1) reconfigure the degree of parallelism of HPC jobs to avoid congestion and wastage, (2) support lower-priority, allocation-elastic jobs that can be scheduled on arbitrary numbers of nodes to consume unallocated resource fragments, and (3) co-schedule batch-processing workloads that use system resources left unoccupied by asymmetric utilization and temporal shifts in the foreground jobs. These techniques will be implemented and supported for free public use as extensions to an open-source resource-management framework. If used broadly, the software has the potential to provide much better utilization of the national investment in HPC facilities.
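
The idea of pricing resources by congestion and then choosing a job's degree of parallelism can be illustrated with a minimal sketch. This is not the CRAM scheduler: the pricing rule, the cost model, the resource names, and all numbers below are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    capacity: float   # total units (cores, GB/s, ...)
    allocated: float  # units already granted to running jobs

    def price(self) -> float:
        # Assumed pricing rule: the price per unit rises steeply as
        # utilization approaches 1, so congested resources cost more.
        utilization = min(self.allocated / self.capacity, 0.999)
        return 1.0 / (1.0 - utilization)


def job_cost(runtime, demand, resources):
    """Cost of one configuration: runtime * sum(price * units held)."""
    return runtime * sum(resources[name].price() * units
                         for name, units in demand.items())


def cheapest_parallelism(options, resources):
    """Return the node count whose configuration is cheapest right now."""
    return min(options, key=lambda n: job_cost(*options[n], resources))


# Hypothetical job: node count -> (runtime in hours, units held while running).
options = {
    64: (1.0, {"cpu": 64, "io_bw": 20.0}),
    16: (5.0, {"cpu": 16, "io_bw": 2.0}),
}

cpu = Resource(capacity=1024, allocated=256)
idle_io = {"cpu": cpu, "io_bw": Resource(capacity=100.0, allocated=10.0)}
busy_io = {"cpu": cpu, "io_bw": Resource(capacity=100.0, allocated=90.0)}

print(cheapest_parallelism(options, idle_io))  # 64: I/O is cheap, run wide
print(cheapest_parallelism(options, busy_io))  # 16: I/O is congested, run narrow
```

With these illustrative numbers the same job is run wide when I/O bandwidth is mostly free and narrow when it is nearly saturated, which is the kind of reconfiguration described in technique (1).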


Active Object Storage to Enable Scalable and Reliable Parallel File System

The increasing performance and decreasing cost of processors have enabled greater system intelligence at peripherals such as disk drives. This computational capability at the disk has led to the development of object-based storage, in which some of the file system functionality is moved to the disk. The same capability can also enable computation at the storage node, in what has been called active disks or active storage. Active storage computation serves as a mechanism for parallel computation using distributed storage nodes.

This research focuses on the use of these active disks for parallel file systems and storage management. A functional active storage system architecture built on the standardized object-storage device specification is being developed. The architecture supports a variety of execution engines, allowing multiple programming languages and models. Using this active object storage architecture, mechanisms to improve overall scalability and large-scale system reliability are being investigated. In addition, active and object storage are used to enable customizable and extensible file systems, including autonomic (self-configuring and self-managing) storage and application-aware storage that can be optimized for application and user needs.


An Application Driven I/O Optimization Approach for PetaScale Systems and Scientific Discoveries


This research focuses on developing scalable parallel file access methods for multi-scale problem domain decompositions, such as those that arise in Adaptive Mesh Refinement (AMR) based algorithms. Existing parallel I/O methods concentrate on optimizing process collaboration under a fairly evenly distributed request pattern. They are not suitable for AMR data structures, however, because the underlying data distribution is highly irregular and dynamic. Process synchronization in existing parallel I/O methods can penalize I/O parallelism if the process collaboration is not carefully coordinated. This research addresses this synchronization issue by developing scalable solutions in the Parallel netCDF library (PnetCDF), a popular I/O library used by many computer simulation communities, targeting AMR-structured data and its I/O patterns. Storing and accessing AMR data in parallel in a scalable way is considered a challenging task. This research will design a process-group based parallel I/O approach that eliminates unrelated processes from the collaboration and thus avoids possible I/O serialization. In addition, a new metadata representation will be developed in PnetCDF to preserve the tree-structured relationships in AMR data in a portable form.
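
One way to picture a process-group based approach is to partition processes so that only those touching common data coordinate with one another. The sketch below is an illustration under that assumption, not the PnetCDF design: the function name `io_groups`, the block identifiers, and the grouping rule (ranks that share at least one block, directly or transitively, form one group) are all hypothetical.

```python
from collections import defaultdict


def io_groups(requests):
    """Partition ranks into groups whose members share at least one block.

    requests maps rank -> set of block ids that rank will access.
    Ranks in different groups touch disjoint blocks, so each group
    could synchronize on its own.
    """
    parent = {rank: rank for rank in requests}

    def find(r):
        # Union-find lookup with path halving.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    first_owner = {}  # block id -> first rank seen accessing it
    for rank, blocks in requests.items():
        for block in blocks:
            if block in first_owner:
                parent[find(rank)] = find(first_owner[block])
            else:
                first_owner[block] = rank

    groups = defaultdict(list)
    for rank in requests:
        groups[find(rank)].append(rank)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical AMR access pattern: six ranks, four refined blocks.
requests = {0: {"a"}, 1: {"a", "b"}, 2: {"b"}, 3: {"c"}, 4: {"c"}, 5: {"d"}}
print(io_groups(requests))
```

In an MPI program, a partition like this would typically be turned into sub-communicators (for example with `MPI_Comm_split`), so that collective I/O involves only the ranks in one group.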


EAGER: Autonomous Data Partitioning Using Data Mining for High End Computing

Query response time and system throughput are the most important metrics when it comes to database and file access performance. Because of data proliferation, efficient access methods and data storage techniques have become increasingly critical to maintain an acceptable query response time and system throughput. One of the common ways to reduce disk I/Os and therefore improve query response time is database clustering, which is a process that partitions the database/file vertically (attribute clustering) and/or horizontally (record clustering). To take advantage of parallelism to improve system throughput, clusters can be placed on different nodes in a cluster machine.

This project develops a novel algorithm, AutoClust, for database/file clustering that dynamically and automatically generates attribute and record clusters based on closed item sets mined from the attribute and record sets found in the queries running against the database/files. The algorithm is capable of re-clustering the database/file in order to continue achieving good system performance despite changes in the data and/or query sets. The project then develops innovative ways to implement AutoClust using the cluster computing paradigm to further reduce query response time and increase system throughput through parallelism and data redundancy. The algorithms are prototyped on a Dell Linux cluster with 486 compute nodes available at the University of Oklahoma. For broader impacts, performance studies are conducted using not only the decision support system database benchmark (TPC-H) but also real data in database and file formats collected from science and healthcare applications in collaboration with domain experts, including scientists at the Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma. The project also contributes to education by training the graduate and undergraduate students working on it in areas of critical national need: database and file management systems, and high-end computing and applications. The developed algorithm and prototype, real datasets, and performance evaluation results are made available to the public at http://www.cs.ou.edu/~database/AutoClust.html .
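
To make the notion of a closed item set concrete: an item set is closed if no proper superset occurs in the same number of queries. The sketch below mines closed attribute sets from a small query log by brute force. It is not the AutoClust algorithm, only an illustration of the kind of input it is described as using; the attribute names and the `min_support` parameter are hypothetical, and the exhaustive enumeration is practical only for a handful of attributes.

```python
from itertools import combinations


def closed_itemsets(transactions, min_support=1):
    """Return {itemset: support} for all closed item sets (brute force).

    transactions is a list of frozensets, e.g. the attributes each
    query references. Support is the number of transactions that
    contain the item set.
    """
    items = sorted(set().union(*transactions))
    support = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                support[candidate] = count
    # Keep an item set only if no proper superset has the same support.
    return {s: c for s, c in support.items()
            if not any(s < other and c == support[other] for other in support)}


# Hypothetical query log: the attributes referenced by each query.
queries = [
    frozenset({"name", "salary"}),
    frozenset({"name", "salary"}),
    frozenset({"name", "salary", "dept"}),
    frozenset({"dept", "address"}),
]

for itemset, count in closed_itemsets(queries, min_support=2).items():
    print(sorted(itemset), count)
```

Here `name` and `salary` always appear together, so they surface as a single closed set rather than as two separate attributes, which suggests storing them in the same vertical partition.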


RUI: Automatic Identification of I/O Bottleneck and Run-time Optimization for Cluster Virtualization

Extending virtualization technology to high-performance cluster platforms opens up exciting new possibilities. However, I/O efficiency in virtualized environments, specifically with respect to disk I/O, remains poorly understood and largely untested.

The objective of this research is to investigate fundamental techniques for virtual clusters that not only facilitate rigorous performance studies, but also identify places where performance is suffering and then optimize the system to lessen the impact of such bottlenecks. To accomplish this objective, the following research tasks will be conducted: 1) An in-depth analysis of I/O efficiency in virtualized environments and investigation of intelligent and automated I/O bottleneck identification schemes; 2) Design and development of techniques to optimize I/O to address the detected I/O bottlenecks; 3) Development of an extensible framework for characterizing I/O workloads across virtualized clusters.

This research will contribute to understanding virtualized I/O, identifying I/O bottlenecks, and optimizing I/O, and will thus help cluster systems use virtualization technology effectively. The project will also contribute to society by promoting research and engaging under-represented groups, helping students advance their careers in science and engineering.


Adaptive Techniques for Achieving End-to-End QoS in the I/O Stack on Petascale Multiprocessors


Emerging high-end computing platforms, such as leadership-class machines at the petascale, provide new horizons for complex modeling and large-scale simulations. These machines are used to execute data-intensive applications of national interest such as climate modeling, cosmic microwave background radiation, and astrophysical thermonuclear flashes. While these systems have unprecedented levels of peak computational power and storage capacity, a critical challenge concerns the design and implementation of scalable I/O (input-output) system software (also called the I/O stack) that makes it possible to harness the power of these systems for scientific discovery and engineering design. At present, no mechanisms are available that accommodate I/O stack-wide, application-level QoS (quality-of-service) specification, monitoring, and management.

This project investigates a revolutionary approach to the QoS-aware management of the I/O stack using feedback control theory, machine learning, and optimization. The goal is to maximize I/O performance and thus improve the overall performance of large-scale applications of national interest. The project uses (1) machine learning and optimization to determine the best decomposition of application-level QoS into sub-QoSs targeting individual resources, and (2) feedback control theory to allocate shared resources managed by the I/O stack such that the specified QoSs are satisfied throughout the execution. The project tests the developed I/O stack enhancements using workloads from NCAR, LBNL, and ANL systems. It also involves two efforts in broadening participation: CISE Visit in Engineering Weekends (VIEW) and NASA-Aerospace Education Services Project (NASA-AESP) at the Center for Science and the Schools (CSATS).
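
The feedback-control part of this approach can be pictured as a loop that measures delivered performance, compares it with the QoS target, and adjusts a resource share accordingly. The sketch below is a minimal integral controller under stated assumptions, not the project's controller: the function names, the gain, the share limits, and the plant model (throughput proportional to the share of a fixed-capacity resource) are all hypothetical.

```python
def control_step(share, measured, target, gain=0.5, lo=0.05, hi=1.0):
    """One integral-control update of a job's bandwidth share."""
    error = (target - measured) / target      # relative QoS shortfall
    return max(lo, min(hi, share + gain * error))


def simulate(target=120.0, capacity=200.0, share=0.2, steps=10):
    """Run the loop against an assumed plant: throughput = share * capacity."""
    history = []
    for _ in range(steps):
        measured = share * capacity            # stand-in for a real monitor
        history.append(measured)
        share = control_step(share, measured, target)
    return share, history


final_share, history = simulate()
print(round(final_share, 3))  # 0.6
```

With a target of 120 MB/s on a 200 MB/s resource, the share settles at 0.6, the fraction that exactly meets the target. A real I/O stack would replace the one-line plant model with measurements and would need the gain tuned for stability.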



Optimization Algorithms for Large-scale, Thermal-aware Storage Systems