Performance monitoring and management of distributed systems.

Kleoni Ioannidou
UCSC Student

Scott Brandt
UCSC Instructor
Google funded project on information trust

Ian Pye
UCSC Student

Bo Adler
UCSC Student

Luca de Alfaro
UCSC Instructor

Shelly Spearing
Mentor

Jorge Roman
Mentor

Greg Levin
UCSC Student

Scott Brandt
UCSC Instructor
Much of today's IT infrastructure, including high‐performance systems, suffers
from poorer and less predictable performance than necessary due to ineffective
resource management. While processor performance is increasing at a rapid rate,
increases in storage and memory performance are rather marginal, turning them
into serious bottlenecks, particularly for data‐intensive applications. At the same
time, memory and most storage subsystems operate in best‐effort mode without
even minimal performance guarantees.
We show that better and more predictable performance can be achieved by
considering system resource characteristics. Our work on disk scheduling shows
how this unpredictable resource can be effectively managed, and guaranteed, by
changing the metric by which it is managed. An analysis of memory performance
reveals a somewhat similar behavior to disk I/O with orders of magnitude
performance difference between best and worst case. Thus, we propose to manage
memory performance the same way, scheduling data‐intensive applications based
on their memory access pattern.
While analyzing system resource characteristics, we found that memory
performance scales with parallel accesses. This can be exploited to predictably
increase memory performance. We chose graphics processors with hundreds of
cores as an example of massively‐parallel architectures, and databases as an
example of data‐intensive applications to put the concept to the test. Our p‐ary
search algorithm successfully demonstrates how using parallel memory accesses
can yield predictable performance increases of up to 200%.
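
To make the p‐ary search idea concrete, here is a minimal sequential sketch in Python. It is not the GPU implementation described above: the function name, the default p, and the final linear scan are assumptions made for illustration. Each round probes p-1 evenly spaced positions in the current window; on a massively‐parallel architecture those probes could be issued concurrently, one per thread, whereas here they are simply evaluated in a loop.

```python
from typing import Sequence


def p_ary_search(data: Sequence[int], key: int, p: int = 4) -> int:
    """Return an index of `key` in sorted `data`, or -1 if it is absent.

    Hypothetical helper: a sequential stand-in for a parallel p-ary search.
    """
    if p < 2:
        raise ValueError("p must be at least 2")
    lo, hi = 0, len(data)  # the candidate window is data[lo:hi]
    while hi - lo > p:
        step = (hi - lo) // p  # at least 1 because hi - lo > p
        # p-1 probe positions split the window into p parts.
        # In a parallel setting each probe would be one concurrent memory access.
        probes = [lo + i * step for i in range(1, p)]
        new_lo, new_hi = lo, hi
        for pos in probes:
            if data[pos] <= key:
                new_lo = pos      # key, if present, lies at or after pos
            else:
                new_hi = pos      # key, if present, lies before pos
                break
        lo, hi = new_lo, new_hi
    # The window now holds at most p elements; finish with a short scan.
    for i in range(lo, hi):
        if data[i] == key:
            return i
    return -1
```

With p = 2 this degenerates to ordinary binary search; larger p trades more probes per round for fewer rounds, which only pays off when the probes can actually be served in parallel.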

Tim Kaldewey
UCSC Student

Scott Brandt
UCSC Instructor
My research focuses on buffer‐cache management for predictable I/O performance
in local and distributed storage systems. The objective is to enable applications with
predictable performance requirements (such as hard, firm, and soft real‐time) to
have direct access to a storage system, and/or share it with applications with other
performance requirements. Our approach consists of virtualizing the performance of the
buffer‐cache, providing the illusion of a dedicated buffer‐cache to each application.
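
One simple way to picture a "dedicated buffer‐cache per application" is a cache that is split into fixed per‐application partitions, so that one application's accesses can never evict another application's blocks. The sketch below is only an illustration under that assumption; the class name, the quota interface, and the LRU replacement policy are hypothetical and are not taken from the project itself.

```python
from collections import OrderedDict


class PartitionedCache:
    """Hypothetical buffer cache with one private LRU partition per application."""

    def __init__(self, quotas):
        # quotas maps an application name to the number of blocks it may hold.
        self.quotas = dict(quotas)
        self.parts = {app: OrderedDict() for app in self.quotas}

    def access(self, app, block, data=None):
        """Touch `block` on behalf of `app`; return True on a hit, False on a miss."""
        part = self.parts[app]
        if block in part:
            part.move_to_end(block)   # mark as most recently used
            return True
        part[block] = data            # miss: bring the block into this partition
        if len(part) > self.quotas[app]:
            part.popitem(last=False)  # evict this application's own LRU block
        return False
```

Because eviction only ever happens inside the partition of the application that missed, the hit rate an application sees depends on its own quota and access pattern, not on what its neighbours are doing.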

Roberto Pineiro
UCSC Student

Scott Brandt
UCSC Instructor
I am working on resource management algorithms for providing guaranteed
performance to mixed workloads executing concurrently in a shared storage
system.
Guaranteed I/O performance is needed for a variety of applications ranging from
real‐time data collection to desktop multimedia to large‐scale scientific simulations.
Reservations on throughput, the standard measure of disk performance, fail to
effectively manage disk performance due to the orders of magnitude difference
between best‐, average‐, and worst‐case response times, allowing reservation of less
than 0.01% of the achievable bandwidth. Moreover, hard I/O performance
guarantees for a mix of workloads are generally considered impractical due to the
stateful nature of disk I/O and the interference between workloads.
Our research provides I/O performance guarantees for a mix of workloads with
different performance and timeliness requirements without sacrificing
performance. This is achieved via a novel disk I/O scheduler, Fahrrad, that makes
hard performance guarantees in terms of disk time utilization. Building upon this
base scheduler, our work addresses several key questions in real‐time disk I/O: how
to provide isolation between concurrently executing request streams, the tradeoff
between the tightness of the guarantees and the overall performance of the system,
and how to use Fahrrad to make throughput and timeliness guarantees.
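
To illustrate what reserving disk time utilization, rather than throughput, could look like, here is a minimal admission‐control sketch. It is not Fahrrad's implementation: the class names, the budget‐per‐period form of a reservation, and the simple "total utilization must not exceed 100%" test are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Reservation:
    """Hypothetical reservation: `budget_ms` of disk time in every `period_ms`."""
    stream: str
    budget_ms: float
    period_ms: float

    @property
    def utilization(self) -> float:
        return self.budget_ms / self.period_ms


@dataclass
class UtilizationAdmission:
    """Hypothetical admission test: reserved disk time must fit in the disk's time."""
    capacity: float = 1.0                      # 1.0 means 100% of disk time
    admitted: list = field(default_factory=list)

    def total(self) -> float:
        return sum(r.utilization for r in self.admitted)

    def admit(self, r: Reservation) -> bool:
        if r.budget_ms <= 0 or r.period_ms <= 0:
            return False
        # Small tolerance so that floating-point rounding does not reject an exact fit.
        if self.total() + r.utilization <= self.capacity + 1e-9:
            self.admitted.append(r)
            return True
        return False
```

The point of the sketch is that disk time is a quantity that always adds up to 100%, independent of whether a stream's requests are sequential or random, whereas achievable throughput varies by orders of magnitude with the access pattern.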

Anna Povzner
UCSC Student

Scott Brandt
UCSC Instructor

Joe Buck
UCSC Student

Noah Watkins
UCSC Student

Scott Brandt
UCSC Instructor

Carlos Maltzahn
UCSC Instructor

This project is an LLNL/UCSC collaboration whose goal is to design a scalable
metadata‐rich file system (MRFS) with database‐like data management services. With such a
file system, scientists will be able to perform time‐critical analysis over continually
evolving, very large data sets.
In the first phase we designed and implemented QUASAR, a path‐based query
language using the POSIX IO data model extended by relational links. We conducted
a couple of data mining case studies in which we compared the baseline architecture,
consisting of a database and a file system, with our MRFS prototype. Through its
query language, the QUASAR interface provides much easier access to large data sets than
POSIX IO. MRFS' querying performance is significantly better than that of the baseline
system due to QUASAR's hierarchical scoping.
Challenges remain and we are in the process of addressing them: we are working on
a scalable physical data model for QUASAR's logical data model, and we are designing
a rich‐metadata client cache to address small update overheads and metadata
coherence.
The work so far has been presented at SOSP'07 (demo & poster), PDSW'07 (poster),
and PDSW'08 (poster), and is reported in two 2008 tech reports available at the
UCSC/PDSI site.

Sasha Ames
UCSC Student

Carlos Maltzahn
UCSC Instructor
I am interested in CPU virtualization. I am working on enabling more flexibility and
control of CPU scheduling by separating user space and kernel space scheduling.

Scott Brandt
UCSC Instructor