Los Alamos National Laboratory
 
 


National Security Education Center


    2012 Posters

  • ACMDEV12: Exploring employment opportunities through microtasks via cybercafes
    Rajan Vaish, James Davis

    Microwork in cybercafés is a promising tool for poverty alleviation. For those who cannot afford a computer, cybercafés can serve as a simple payment channel and as a platform to work. However, there are questions about whether workers are interested in working in cybercafés, whether cybercafé owners are willing to host such a setup, and whether workers are skilled enough to earn an acceptable pay rate. We designed experiments in cybercafés in India and Kenya to investigate these issues. We also investigated whether computers make workers more productive than mobile platforms. In surveys, we found that 99% of the users wanted to continue with the experiment in the cybercafé, while 8 of 9 cybercafé owners showed interest in hosting this experiment. User typing speed was adequate to earn a pay rate comparable to their existing wages, and the fastest workers were approximately twice as productive using a computer platform.

    2012 Presentations

  • DISCS'12: A Plugin for HDF5 using PLFS for Improved I/O Performance and Semantic Analysis
    John Bent, Gary Grider, Aaron Torres, Kshitij Mehta, Edgar Gabriel
  • ACMDEV12: Exploring employment opportunities through microtasks via cybercafes
    Rajan Vaish, James Davis

    Microwork in cybercafés is a promising tool for poverty alleviation. For those who cannot afford a computer, cybercafés can serve as a simple payment channel and as a platform to work. However, there are questions about whether workers are interested in working in cybercafés, whether cybercafé owners are willing to host such a setup, and whether workers are skilled enough to earn an acceptable pay rate. We designed experiments in cybercafés in India and Kenya to investigate these issues. We also investigated whether computers make workers more productive than mobile platforms. In surveys, we found that 99% of the users wanted to continue with the experiment in the cybercafé, while 8 of 9 cybercafé owners showed interest in hosting this experiment. User typing speed was adequate to earn a pay rate comparable to their existing wages, and the fastest workers were approximately twice as productive using a computer platform.

  • GHTC12: Exploring employment opportunities through microtasks via cybercafes
    Rajan Vaish, James Davis

    Microwork in cybercafés is a promising tool for poverty alleviation. For those who cannot afford a computer, cybercafés can serve as a simple payment channel and as a platform to work. However, there are questions about whether workers are interested in working in cybercafés, whether cybercafé owners are willing to host such a setup, and whether workers are skilled enough to earn an acceptable pay rate. We designed experiments in cybercafés in India and Kenya to investigate these issues. We also investigated whether computers make workers more productive than mobile platforms. In surveys, we found that 99% of the users wanted to continue with the experiment in the cybercafé, while 8 of 9 cybercafé owners showed interest in hosting this experiment. User typing speed was adequate to earn a pay rate comparable to their existing wages, and the fastest workers were approximately twice as productive using a computer platform.

  • Greenplum—SciHadoop: Array-based Query Processing in Hadoop
    Joe Buck

    Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop’s byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats, resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.
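
    As a rough illustration of pruning input partitions, the sketch below keeps only the partitions whose index ranges intersect the region a query reads. It assumes that partitions and queries can both be described as rectangular index ranges over the array; the abstract does not say how SciHadoop represents data dependencies, and `overlaps` and `prune_partitions` are hypothetical names used only here.

```python
def overlaps(partition, query):
    """True if two hyper-rectangles intersect.

    Each argument is a list of (start, stop) half-open index ranges,
    one per array dimension. This representation is an assumption.
    """
    return all(p_start < q_stop and q_start < p_stop
               for (p_start, p_stop), (q_start, q_stop) in zip(partition, query))


def prune_partitions(partitions, query):
    """Keep only the input partitions that the query region touches."""
    return [p for p in partitions if overlaps(p, query)]
```

    For a 20x20 array split into four 10x10 partitions, a query over rows 2 to 5 and columns 12 to 15 would leave a single partition to read.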

  • Microsoft Research—SciHadoop: Array-based Query Processing in Hadoop
    Joe Buck

    Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop’s byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats, resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.

  • MSST2012—On the Role of Burst Buffers in Leadership-Class Storage Systems
    Gary Grider, Adam Crume, Carlos Maltzahn
  • MSST2012—Valmar: High-Bandwidth Real-Time Streaming Data Management
    John Bent, HB Chen, David Bigelow, Scott Brandt
  • PacificVis2012—Analyzing the Evolution of Large Scale Structures in the Universe with Velocity Based Methods
    James Ahrens, Eddy Chandra, Uliana Popov, Alex Pang
  • PDSW'12: Discovering Structure in Unstructured I/O
    John Bent, Gary Grider, Aaron Torres, Jun He, Garth Gibson, Carlos Maltzahn, Xian-He Sun
  • SIGIR12: Summarizing Highly Structured Documents for Effective Search Interaction
    Lanbo Zhang, Yi Zhang

    2011 Posters

  • 2011 ISSDM Day—Computer Vision using Human Computation
    James Davis

    Computer-mediated human micro-labor markets allow human effort to be treated as a programmatic function call. We can characterize these platforms as Human Processing Units (HPU). HPU computation can be more accurate than complex CPU based algorithms on some important computer vision tasks. We also argue that HPU computation can be cheaper than state-of-the-art CPU based computation. I'll give some examples of simple computer vision tasks that we have evaluated, and speculate on whether a finite computer vision instruction set is possible. The instruction set would allow most computer vision problems to be coded from the base instructions, and the instructions themselves would be made robust with the help of human computation.

  • 2011 ISSDM Day—Crowdsight: Rapidly Prototyping Visual Processing Apps
    Reid Porter, Mario Rodriguez, James Davis

    We describe a framework for rapidly prototyping applications which require intelligent visual processing, but for which reliable algorithms do not yet exist, or for which engineering those algorithms is too costly. The framework, CrowdSight, leverages the power of crowdsourcing to offload intelligent processing to humans, and enables new applications to be built quickly and cheaply, affording system builders the opportunity to validate a concept before committing significant time or capital. Our service accepts requests from users either via email or simple mobile applications, and handles all the communication with a backend human computation platform. We build redundant requests and data aggregation into the system, freeing the user from managing these requirements. We validate our framework by building several test applications and verifying that prototypes can be built more easily and quickly than would be the case without the framework.
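
    The redundant requests and data aggregation mentioned above could look roughly like the following sketch. It assumes simple majority voting over repeated answers to the same request; the abstract does not say which aggregation rule CrowdSight uses, and `aggregate_answers` and its `min_agreement` threshold are hypothetical names for illustration.

```python
from collections import Counter


def aggregate_answers(answers, min_agreement=0.5):
    """Return the majority answer from redundant worker responses.

    Hypothetical helper: returns None when no single answer is given by
    more than `min_agreement` of the workers, so that the caller can
    request more responses.
    """
    if not answers:
        return None
    # Normalize free-text answers so "Cat" and " cat " count as one vote.
    normalized = [a.strip().lower() for a in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    if votes / len(normalized) > min_agreement:
        return winner
    return None
```

    With three responses of which two agree, the agreed answer is returned; with an even split, the result is None and more responses would be needed.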

  • 2011 ISSDM Day—Crowdsight: Rapidly Prototyping Visual Processing Apps
    John Galbraith, Janelle Yong, Don Wiberg
  • 2011 ISSDM Day—Divergent Physical Design Tuning
    John Bent, Gary Grider, Michael Lang, Meghan McClelland, James Nunez, Jeff LeFevre, Scott Brandt, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis

    We introduce a new method for tuning the physical design of replicated databases. Our method installs a different (divergent) index configuration to each replica, thus specializing replicas for different subsets of the database workload. We analyze the space of divergent designs and show that there is a tension between the specialization of each replica and the ability to load-balance the database workload across different replicas. Based on our analysis, we develop an algorithm to compute good divergent designs that can balance this trade-off. Experimental results demonstrate the efficacy of our approach.

  • 2011 ISSDM Day—Filtering Semi-Structured Documents Based on Faceted Feedback
    Carla Kuiken, Lanbo Zhang, Yi Zhang

    Existing adaptive filtering systems learn user profiles based on users' relevance judgments on documents. In some cases, users have some prior knowledge about what features are important for a document to be relevant. For example, a Spanish speaker may only want news written in Spanish, and thus a relevant document should contain the feature "Language: Spanish"; a researcher working on HIV knows an article with the medical subject "MeSH: AIDS" is very likely to be interesting to him/her.

    Semi-structured documents with rich faceted metadata are increasingly prevalent over the Internet. Motivated by the commonly used faceted search interface in e-commerce, we study whether users' prior knowledge about faceted features could be exploited for filtering semi-structured documents. We envision two faceted feedback solicitation mechanisms, and propose a novel user profile-learning algorithm that can incorporate user feedback on features. To evaluate the proposed work, we use two data sets from the TREC filtering track, and conduct a user study on Amazon Mechanical Turk. Our experimental results show that user feedback on faceted features is useful for filtering. The new user profile learning algorithm can effectively learn from user feedback on faceted features and performs better than several other methods adapted from the feature-based feedback techniques proposed for retrieval and text classification tasks in previous work.

  • 2011 ISSDM Day—FLAMBES: Evolving Fast Performance Models
    John Bent, Stephan Eidenbenz, Meghan McClelland, Adam Crume, Carlos Maltzahn, Neoklis Polyzotis, Manfred Warmuth

    Large clusters and supercomputers are simulated to aid in design. Many devices, such as hard drives, are slow to simulate. Our approach is to use a genetic algorithm to fit parameters for an analytical model of a device. Fitting focuses on aggregate accuracy rather than request-level accuracy since individual request times are irrelevant in large simulations. The model is fitted to traces from a physical device or a known device-accurate model. This is done once, offline, before running the simulation. Execution of the model is fast, since it only requires a modest amount of floating point math and no event queuing. Only a few floating-point numbers are needed for state. Compared to an event-driven model, this trades a little accuracy for a large gain in performance.

  • 2011 ISSDM Day—Halo Finder vs Local Extractors: Similarities and Differences
    Christopher Brislawn, Uliana Popov, Alex Pang

    Multi-streaming events are of great interest to astrophysics because they are associated with the formation of large-scale structures (LSS) such as halos, filaments and sheets. Until recently, these events were studied using the scalar density field only. In this talk, we present a new approach that takes into account the velocity field information in finding these multi-streaming events. Six different velocity-based feature extractors are defined, and their findings are compared to a halo finder's results.

  • 2011 ISSDM Day—Insertion-optimized File System
    Christine Ahrens, Michael Lang, Latchesar Ionkov, Scott Brandt, Carlos Maltzahn

    Gostor is an experimental platform for testing new file storage ideas for post-POSIX usage. Gostor provides greater flexibility for manipulating the data within the file, including inserting and deleting data anywhere in the file, creating and removing holes in the data, etc. Each modification of the data creates a new file. Gostor does not implement any way of organizing the files in hierarchical structures, or of mapping them to strings. Thus Gostor can be used to implement standard file systems as well as to experiment with new ways of storing and accessing users' data.
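
    The behavior described above (insert or delete anywhere, with each modification producing a new file) can be sketched in memory as follows. This is not Gostor's interface: `VersionedStore`, its method names, and the use of integer file identifiers are all assumptions made for illustration.

```python
class VersionedStore:
    """In-memory sketch: every modification yields a new file ID.

    Files are identified only by integers; there are no names or
    directories, mirroring the description above.
    """

    def __init__(self):
        self._files = {}
        self._next_id = 0

    def create(self, data=b""):
        """Store data as a new file and return its ID."""
        file_id = self._next_id
        self._files[file_id] = bytes(data)
        self._next_id += 1
        return file_id

    def read(self, file_id):
        return self._files[file_id]

    def insert(self, file_id, offset, data):
        """Return the ID of a new file with `data` inserted at `offset`."""
        old = self._files[file_id]
        if not 0 <= offset <= len(old):
            raise ValueError("offset out of range")
        return self.create(old[:offset] + bytes(data) + old[offset:])

    def delete(self, file_id, offset, length):
        """Return the ID of a new file with `length` bytes removed."""
        old = self._files[file_id]
        if offset < 0 or length < 0 or offset + length > len(old):
            raise ValueError("range out of bounds")
        return self.create(old[:offset] + old[offset + length:])
```

    The original file stays readable under its old ID after each operation, which is the property the sketch is meant to show.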

  • 2011 ISSDM Day—Managing High-Bandwidth Real-Time Data Storage
    John Bent, HB Chen, Gary Grider, Meghan McClelland, James Nunez, David Bigelow, Scott Brandt

    In an information-driven world, the ability to capture and store data in real-time is of the utmost importance. The scope and intent of such data capture, however, varies widely. Individuals record television programs for later viewing, governments maintain vast sensor networks to warn against calamity, scientists conduct experiments requiring immense data collection, and automated monitoring tools supervise a host of processes which human hands rarely touch. All such tasks have the same basic requirements -- guaranteed capture of streaming real-time data -- but with greatly differing parameters. Our ability to process and interpret data has grown faster than our ability to store and manage it, which has led to the curious condition of being able to recognize the importance of data without being able to store it, and hence unable to later profit by it.

    Traditional storage mechanisms are not well suited to manage this type of data and we have developed a large-scale ring buffer storage architecture to handle it. Our system is well suited to both large and small data elements, has a native indexing mechanism, and can maintain reliability in the face of hardware failure. Strong performance guarantees can be made and kept, and quality of service requirements maintained.
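
    The ring buffer idea can be illustrated with a minimal in-memory sketch: a fixed number of slots, with the newest element overwriting the oldest once the buffer is full. The class name `RingBufferStore` and the use of a sequence number as the index are assumptions; the abstract does not describe the actual on-disk layout or indexing scheme.

```python
class RingBufferStore:
    """Fixed-capacity store that overwrites the oldest element when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.slots = [None] * capacity
        self.next_seq = 0  # sequence number of the next element to write

    def append(self, element):
        """Store an element and return its sequence number."""
        seq = self.next_seq
        self.slots[seq % self.capacity] = element
        self.next_seq += 1
        return seq

    def get(self, seq):
        """Return the element with this sequence number, or None if it
        was never written or has already been overwritten."""
        oldest = max(0, self.next_seq - self.capacity)
        if seq < oldest or seq >= self.next_seq:
            return None
        return self.slots[seq % self.capacity]
```

    Because write positions are fully determined by the sequence number, each append touches exactly one slot, which is one reason a ring layout lends itself to predictable write performance.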

  • 2011 ISSDM Day—Push-based Processing of Scientific Data
    John Bent, Meghan McClelland, James Nunez, Noah Watkins, Scott Brandt, Carlos Maltzahn

    Large-scale scientific data is collected through experiment and produced by simulation. This data in turn is commonly interrogated using ad-hoc analysis queries, and visualized with differing interactivity requirements. At extreme scale this data can be too large to store multiple copies, or may be easily accessible for only a short period of time. In either case, multiple consumers must be able to interact with the data. Unfortunately, as the number of concurrent users accessing storage media increases, the throughput can decrease significantly. This performance degradation is due to the induced random access pattern that results from uncoordinated I/O streams. One common approach to this problem is to use collective I/O; unfortunately, this is difficult to do for many independent computations. We are investigating a data-centric, push-based approach inspired by work within the database community that has achieved an order of magnitude increase in throughput for concurrent query processing. A push-based approach to query processing uses a single data stream originating from storage media rather than allowing multiple requests to compete, and utilizes work and data sharing opportunities exposed through query semantics. There are many challenges in this work, notably supporting a distributed execution environment, providing a mix of access performance requirements (throughput vs. latency), and supporting multiple data models, including relational and array-based.
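
    A minimal sketch of the push-based idea follows: the data is read once, in order, and each block is pushed to every registered query instead of each query issuing its own reads. The function name `shared_scan` and the representation of a query as a fold function with an initial state are assumptions for illustration, not the project's design.

```python
def shared_scan(blocks, queries):
    """Read each block once and push it to every registered query.

    `queries` maps a query name to a (fold, initial_state) pair, where
    `fold(state, block)` returns the updated state for that query.
    """
    states = {name: init for name, (_, init) in queries.items()}
    for block in blocks:  # single sequential pass over the data
        for name, (fold, _) in queries.items():
            states[name] = fold(states[name], block)
    return states
```

    For example, a sum query and a max query registered together would both be answered from one pass over the blocks, so the storage device sees a single sequential stream.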

  • 2011 ISSDM Day—QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language
    Sasha Ames, Maya Gokhale

    File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100MB. Today's high-performance file systems store 7 to 9 orders of magnitude more data, resulting in numbers of data items for which these metadata abstractions are inadequate, such as directory hierarchies unable to handle complex relationships among data.

    Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems.

    To address this problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. Our service uses a query language interface for file identification and attribute retrieval. We present our metadata management service design and architecture and study its performance using a text analysis benchmark application. Results from our QMDS prototype show the effectiveness of this approach. Compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads.

  • 2011 ISSDM Day—Rad-Flows: Buffering for Predictable Communications
    Kleoni Ioannidou, Scott Brandt

    Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well understood, but the interaction of communicating tasks with different timing characteristics is less well understood. We discuss several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole.

  • 2011 ISSDM Day—RAID4S: Supercharging RAID Small Writes with SSD
    John Bent, Gary Grider, Meghan McClelland, James Nunez, Rosie Wacha, Scott Brandt, Carlos Maltzahn

    Parity-based RAID techniques improve data reliability and availability, but at a significant performance cost, especially for small writes. Flash-based solid state drives (SSDs) provide faster random I/O and use less power than hard drives, but are too expensive to substitute for all of the drives in most large-scale storage systems. We present RAID4S, a cost-effective, high-performance technique for improving RAID small-write performance using SSDs for parity storage in a disk-based RAID array. Our results show that a 4HDD+1SSD RAID4S array achieves throughputs 3.3X better than a similar 4+1 RAID4 array and 1.75X better than a 4+1 RAID5 array on small-write-intensive workloads. RAID4S has no performance penalty on disk workloads consisting of up to 90% reads and its benefits are enhanced by the effects of file systems and caches.
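
    To show why small writes are costly in parity-based RAID, the sketch below implements the standard read-modify-write parity update: the new parity is the old parity XORed with the old and new contents of the block being overwritten. Every small write therefore reads and writes the parity block, and in RAID4 all parity sits on one drive. The function names are illustrative and are not taken from RAID4S.

```python
def xor_blocks(a, b):
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data, new_data, old_parity):
    """Parity after overwriting one data block (read-modify-write)."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


def full_parity(blocks):
    """Parity computed from scratch over every data block in a stripe."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity
```

    The incremental update gives the same parity as recomputing over the whole stripe, but it only needs the old data block and the old parity block, which is why the speed of the parity device matters so much for small writes.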

  • 2011 ISSDM Day—SciHadoop: Array-based Query Processing in Hadoop
    James Ahrens, John Bent, Gary Grider, Michael Lang, Joe Buck, Scott Brandt, Maya Gokhale, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis, Wang-Chiew Tan

    Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats, resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.

  • 2011 ISSDM Day—The Stanford-UC Santa Cruz Project for Cooperative Computing with Algorithms, Data and People
    Neoklis Polyzotis

    This talk will provide an overview of the SCOOP project, whose broad theme is to leverage people as processing units to achieve some global objective. A primary focus of SCOOP is to optimize the usage of human computation in order to use as few resources (e.g., time, money) as possible while maximizing the quality of the final output. Our approach is based on the principle of declarative languages that has been applied very successfully in database systems. The talk will describe the main research thrusts in SCOOP and some of our recent accomplishments.

  • 2011 ISSDM Day—Using reputation systems to increase content quality: lessons from Wikipedia and Google Maps
    Luca de Alfaro
  • 2011 ISSDM Day—Viral Genomics and the Semantic Web
    Carla Kuiken

    The “HIV database and analysis platform” has been maintained in Los Alamos for 22 years, and has grown to be an internationally renowned resource for HIV data analysis. It is in the process of expanding to include hepatitis C virus and hemorrhagic fever viruses; the eventual goal is to make it a universal viral resource. This expansion necessitates much greater reliance on external data and information sources. These resources rarely use the same identifiers and frequently contain annotator- and submitter-specific language. While efforts have been underway for some time to standardize and cross-link biological information on the web, there is still a long way to go. I will describe the current status of the “Viral data analysis platform”, the (semantic) problems we have grappled with, and the local and global efforts at amelioration.

  • 2011 ISSDM Day—Viral Genomics and the Semantic Web
    Tracy Holsclaw

    Gaussian process (GP) models provide non-parametric methods to fit continuous curves observed with noise. Motivated by our investigation of dark energy, we develop a GP-based inverse method that allows for the direct estimation of the derivative of a curve. In principle, a GP model may be fit to the data directly, with the derivatives obtained by means of differentiation of the correlation function. However, it is known that this approach can be inadequate due to loss of information when differentiating. We present a new method of obtaining the derivative process by viewing this procedure as an inverse problem. We use the properties of a GP to obtain a computationally efficient fit. We illustrate our method with simulated data and apply it to our cosmological application.

  • 2011 Mini Showcase—A Push-based Approach to Array Query Processing
    John Bent, Gary Grider, Meghan McClelland, Noah Watkins
  • 2011 Mini Showcase—RAID4S: Improving RAID Performance with Solid State Drives
    John Bent, Meghan McClelland, Rosie Wacha
  • 2011 Mini Showcase—SciHadoop, Array-based Query Processing in Hadoop
    John Bent, Meghan McClelland, Joe Buck
  • ASCR11—DAMASC Adding Data Management Services to Parallel File Systems
    James Ahrens, John Bent, Gary Grider, Meghan McClelland, James Nunez, Kleoni Ioannidou, Joe Buck, Noah Watkins, Scott Brandt, Maya Gokhale, Carlos Maltzahn, Neoklis Polyzotis, Wang-Chiew Tan
  • CIKM11—On Bias Problem in Relevance Feedback
    Lanbo Zhang, Yi Zhang
  • HEC FSIO 11—Push-based Approach to Scientific Data Processing
    John Bent, Meghan McClelland, Noah Watkins, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis
  • SC11—FLAMBES: Evolving Fast Performance Models
    John Bent, Stephan Eidenbenz, Meghan McClelland, Adam Crume, Carlos Maltzahn

    2011 Presentations

  • 2011 ISSDM Day—Computer Vision using Human Computation
    James Davis

    Computer-mediated human micro-labor markets allow human effort to be treated as a programmatic function call. We can characterize these platforms as Human Processing Units (HPU). HPU computation can be more accurate than complex CPU based algorithms on some important computer vision tasks. We also argue that HPU computation can be cheaper than state-of-the-art CPU based computation. I'll give some examples of simple computer vision tasks that we have evaluated, and speculate on whether a finite computer vision instruction set is possible. The instruction set would allow most computer vision problems to be coded from the base instructions, and the instructions themselves would be made robust with the help of human computation.

  • 2011 ISSDM Day—Crowdsight: Rapidly Prototyping Visual Processing Apps
    Reid Porter, Mario Rodriguez, James Davis

    We describe a framework for rapidly prototyping applications which require intelligent visual processing, but for which reliable algorithms do not yet exist, or for which engineering those algorithms is too costly. The framework, CrowdSight, leverages the power of crowdsourcing to offload intelligent processing to humans, and enables new applications to be built quickly and cheaply, affording system builders the opportunity to validate a concept before committing significant time or capital. Our service accepts requests from users either via email or simple mobile applications, and handles all the communication with a backend human computation platform. We build redundant requests and data aggregation into the system, freeing the user from managing these requirements. We validate our framework by building several test applications and verifying that prototypes can be built more easily and quickly than would be the case without the framework.

  • 2011 ISSDM Day—Divergent Physical Design Tuning
    John Bent, Gary Grider, Michael Lang, Meghan McClelland, James Nunez, Jeff LeFevre, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis

    We introduce a new method for tuning the physical design of replicated databases. Our method installs a different (divergent) index configuration to each replica, thus specializing replicas for different subsets of the database workload. We analyze the space of divergent designs and show that there is a tension between the specialization of each replica and the ability to load-balance the database workload across different replicas. Based on our analysis, we develop an algorithm to compute good divergent designs that can balance this trade-off. Experimental results demonstrate the efficacy of our approach.

  • 2011 ISSDM Day—Filtering Semi-Structured Documents Based on Faceted Feedback
    Carla Kuiken, Lanbo Zhang, Yi Zhang

    Existing adaptive filtering systems learn user profiles based on users' relevance judgments on documents. In some cases, users have some prior knowledge about what features are important for a document to be relevant. For example, a Spanish speaker may only want news written in Spanish, and thus a relevant document should contain the feature "Language: Spanish"; a researcher working on HIV knows an article with the medical subject "MeSH: AIDS" is very likely to be interesting to him/her.

    Semi-structured documents with rich faceted metadata are increasingly prevalent over the Internet. Motivated by the commonly used faceted search interface in e-commerce, we study whether users' prior knowledge about faceted features could be exploited for filtering semi-structured documents. We envision two faceted feedback solicitation mechanisms, and propose a novel user profile-learning algorithm that can incorporate user feedback on features. To evaluate the proposed work, we use two data sets from the TREC filtering track, and conduct a user study on Amazon Mechanical Turk. Our experimental results show that user feedback on faceted features is useful for filtering. The new user profile learning algorithm can effectively learn from user feedback on faceted features and performs better than several other methods adapted from the feature-based feedback techniques proposed for retrieval and text classification tasks in previous work.

  • 2011 ISSDM Day—FLAMBES: Evolving Fast Performance Models
    John Bent, Stephan Eidenbenz, Meghan McClelland, Adam Crume, Carlos Maltzahn, Neoklis Polyzotis, Manfred Warmuth

    Large clusters and supercomputers are simulated to aid in design. Many devices, such as hard drives, are slow to simulate. Our approach is to use a genetic algorithm to fit parameters for an analytical model of a device. Fitting focuses on aggregate accuracy rather than request-level accuracy since individual request times are irrelevant in large simulations. The model is fitted to traces from a physical device or a known device-accurate model. This is done once, offline, before running the simulation. Execution of the model is fast, since it only requires a modest amount of floating point math and no event queuing. Only a few floating-point numbers are needed for state. Compared to an event-driven model, this trades a little accuracy for a large gain in performance.

  • 2011 ISSDM Day—Halo Finder vs Local Extractors: Similarities and Differences
    Christopher Brislawn, Uliana Popov, Alex Pang

    Multi-streaming events are of great interest to astrophysics because they are associated with the formation of large-scale structures (LSS) such as halos, filaments and sheets. Until recently, these events were studied using the scalar density field only. In this talk, we present a new approach that takes into account the velocity field information in finding these multi-streaming events. Six different velocity-based feature extractors are defined, and their findings are compared to a halo finder's results.

  • 2011 ISSDM Day—Insertion-optimized File System
    James Ahrens, Michael Lang, Latchesar Ionkov, Scott Brandt, Carlos Maltzahn

    Gostor is an experimental platform for testing new file storage ideas for post-POSIX usage. Gostor provides greater flexibility for manipulating the data within a file, including inserting and deleting data anywhere in the file, creating and removing holes in the data, etc. Each modification of the data creates a new file. Gostor doesn't implement any way of organizing the files in hierarchical structures or of mapping them to strings. Thus Gostor can be used to implement standard file systems as well as to experiment with new ways of storing and accessing users' data.
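
    The semantics described above (insert and delete anywhere, every modification yielding a new file, no names or hierarchy) can be mimicked with a toy in-memory store. The class and method names below are hypothetical and say nothing about Gostor's actual interface or on-disk layout.

```python
class Store:
    """Toy store: immutable files identified only by integer ids (no names, no hierarchy)."""

    def __init__(self):
        self._files = {}
        self._next_id = 0

    def create(self, data=b""):
        """Store a new immutable file and return its id."""
        file_id = self._next_id
        self._files[file_id] = bytes(data)
        self._next_id += 1
        return file_id

    def read(self, file_id):
        return self._files[file_id]

    def insert(self, file_id, offset, data):
        """Return the id of a NEW file with data spliced in at offset."""
        old = self._files[file_id]
        return self.create(old[:offset] + data + old[offset:])

    def delete(self, file_id, offset, length):
        """Return the id of a NEW file with length bytes removed at offset."""
        old = self._files[file_id]
        return self.create(old[:offset] + old[offset + length:])


store = Store()
v1 = store.create(b"hello world")
v2 = store.insert(v1, 5, b",")   # v1 is left untouched
v3 = store.delete(v2, 0, 7)

# A conventional namespace can be layered on top as a separate mapping.
names = {"/greeting.txt": v2}
```

    Because the store itself knows nothing about names, a path-based file system is just one possible layer (here the `names` dictionary) over the same ids.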

  • 2011 ISSDM Day—Keynote Address: “Los Alamos National Laboratory, a Unique, Irreplaceable, National Resource in the Department of Energy”
    Gary Grider

    The talk will provide an unclassified overview of the Los Alamos National Laboratory, its people, programs, and capabilities. The talk touches on much of the diverse science going on at the laboratory in areas such as materials, biology, cosmology, energy, and climate. A drill-down into information science, computer science, and high-performance computing is also provided.

  • 2011 ISSDM Day—Managing High-Bandwidth Real-Time Data Storage
    John Bent, HB Chen, Gary Grider, Meghan McClelland, James Nunez, David Bigelow, Scott Brandt

    In an information-driven world, the ability to capture and store data in real-time is of the utmost importance. The scope and intent of such data capture, however, varies widely. Individuals record television programs for later viewing, governments maintain vast sensor networks to warn against calamity, scientists conduct experiments requiring immense data collection, and automated monitoring tools supervise a host of processes which human hands rarely touch. All such tasks have the same basic requirements -- guaranteed capture of streaming real-time data -- but with greatly differing parameters. Our ability to process and interpret data has grown faster than our ability to store and manage it, which has led to the curious condition of being able to recognize the importance of data without being able to store it, and hence being unable to later profit by it.
    Traditional storage mechanisms are not well suited to manage this type of data, and we have developed a large-scale ring buffer storage architecture to handle it. Our system is well suited to both large and small data elements, has a native indexing mechanism, and can maintain reliability in the face of hardware failure. Strong performance guarantees can be made and kept, and quality of service requirements maintained.
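
    For readers unfamiliar with the ring buffer idea, here is a minimal in-memory sketch: writes always go to the next slot, the oldest record is overwritten once capacity is reached, and a record's sequence number doubles as its index. The names are hypothetical, and the sketch ignores everything that makes the real system hard (disks, failures, performance guarantees).

```python
class RingStore:
    """Fixed-capacity ring buffer addressed by monotonically increasing sequence numbers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.next_seq = 0

    def append(self, record):
        """Write into the next slot, overwriting the oldest record when full."""
        seq = self.next_seq
        self.slots[seq % self.capacity] = record
        self.next_seq += 1
        return seq

    def get(self, seq):
        """Look up a record by sequence number; overwritten or future records raise KeyError."""
        oldest = max(0, self.next_seq - self.capacity)
        if not oldest <= seq < self.next_seq:
            raise KeyError(seq)
        return self.slots[seq % self.capacity]


ring = RingStore(capacity=3)
for i in range(5):
    ring.append(f"frame-{i}")
```

    After five appends into three slots, only sequence numbers 2 through 4 remain readable. The slot position follows directly from the sequence number, so no separate index structure is needed in this toy version.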

  • 2011 ISSDM Day—Push-based Processing of Scientific Data
    John Bent, Meghan McClelland, James Nunez, Noah Watkins, Scott Brandt, Carlos Maltzahn

    Large-scale scientific data is collected through experiment and produced by simulation. This data in turn is commonly interrogated using ad-hoc analysis queries, and visualized with differing interactivity requirements. At extreme scale this data can be too large to store multiple copies, or may be easily accessible for only a short period of time. In either case, multiple consumers must be able to interact with the data. Unfortunately, as the number of concurrent users accessing storage media increases, the throughput can decrease significantly. This performance degradation is due to the induced random access pattern that results from uncoordinated I/O streams. One common approach to this problem is to use collective I/O; unfortunately, this is difficult to do for many independent computations. We are investigating a data-centric, push-based approach inspired by work within the database community that has achieved an order of magnitude increase in throughput for concurrent query processing. A push-based approach to query processing uses a single data stream originating from storage media rather than allowing multiple requests to compete, and utilizes work and data sharing opportunities exposed through query semantics. There are many challenges in this work, notably supporting a distributed execution environment, providing a mix of access performance requirements (throughput vs. latency), and supporting multiple data models, including relational and array-based.
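
    The core of the push-based idea can be shown in a few lines: the data is read once, in storage order, and each chunk is pushed to every registered query instead of each query issuing its own reads. The query classes and the chunked input below are assumptions made for the example, not part of the described system.

```python
class SumQuery:
    """Running sum over everything pushed so far."""
    def __init__(self):
        self.value = 0

    def consume(self, chunk):
        self.value += sum(chunk)


class MaxQuery:
    """Running maximum over everything pushed so far."""
    def __init__(self):
        self.value = None

    def consume(self, chunk):
        top = max(chunk)
        self.value = top if self.value is None else max(self.value, top)


def push_scan(chunks, queries):
    """Read each chunk once and push it to every query; return the number of reads."""
    reads = 0
    for chunk in chunks:
        reads += 1
        for query in queries:
            query.consume(chunk)
    return reads


chunks = [[3, 1, 4], [1, 5, 9], [2, 6]]
total, largest = SumQuery(), MaxQuery()
reads = push_scan(chunks, [total, largest])
```

    With two queries and three chunks, a pull-based design would issue six reads, whereas the shared scan issues three regardless of how many queries are attached.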

  • 2011 ISSDM Day—QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language
    Sasha Ames, Maya Gokhale

    File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100MB. Today's high-performance file systems store 7 to 9 orders of magnitude more data, resulting in numbers of data items for which these metadata abstractions are inadequate, such as directory hierarchies unable to handle complex relationships among data.

    Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems.

    To address this problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. Our service uses a query language interface for file identification and attribute retrieval. We present our metadata management service design and architecture and study its performance using a text analysis benchmark application. Results from our QMDS prototype show the effectiveness of this approach. Compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads.
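
    To make the data model concrete, the sketch below stores attributes on both nodes and edges and answers one file-identification query in plain Python. The attribute names, the sample data, and the `find_sources` helper are all hypothetical; this is not the QMDS query language or its storage format.

```python
# Hypothetical in-memory graph: node id -> attributes, plus attributed edges.
nodes = {
    1: {"kind": "file", "name": "report.txt", "size": 2048},
    2: {"kind": "file", "name": "data.csv", "size": 500000},
    3: {"kind": "term", "text": "storage"},
}
edges = [
    (1, 3, {"relation": "contains", "count": 7}),
    (2, 3, {"relation": "contains", "count": 1}),
]


def find_sources(nodes, edges, node_pred, edge_pred, target_pred):
    """Ids of nodes matching node_pred that have an outgoing edge matching
    edge_pred to a node matching target_pred."""
    return sorted({
        src for src, dst, attrs in edges
        if node_pred(nodes[src]) and edge_pred(attrs) and target_pred(nodes[dst])
    })


# "Files that contain the term 'storage' at least five times."
frequent = find_sources(
    nodes, edges,
    node_pred=lambda n: n["kind"] == "file",
    edge_pred=lambda e: e["relation"] == "contains" and e["count"] >= 5,
    target_pred=lambda n: n.get("text") == "storage",
)
```

    The query filters on node attributes and edge attributes in one pass, which is the kind of relationship a directory hierarchy alone cannot express.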

  • 2011 ISSDM Day—Rad-Flows: Buffering for Predictable Communications
    Kleoni Ioannidou, Scott Brandt

    Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well understood, but the interaction of communicating tasks with different timing characteristics is less well understood. We discuss several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole.

  • 2011 ISSDM Day—RAID4S: Supercharging RAID Small Writes with SSD
    John Bent, Gary Grider, Meghan McClelland, James Nunez, Rosie Wacha, Scott Brandt, Carlos Maltzahn

    Parity-based RAID techniques improve data reliability and availability, but at a significant performance cost, especially for small writes. Flash-based solid state drives (SSDs) provide faster random I/O and use less power than hard drives, but are too expensive to substitute for all of the drives in most large-scale storage systems. We present RAID4S, a cost-effective, high-performance technique for improving RAID small-write performance using SSDs for parity storage in a disk-based RAID array. Our results show that a 4HDD+1SSD RAID4S array achieves throughputs 3.3X better than a similar 4+1 RAID4 array and 1.75X better than a 4+1 RAID5 array on small-write-intensive workloads. RAID4S has no performance penalty on disk workloads consisting of up to 90% reads and its benefits are enhanced by the effects of file systems and caches.
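
    The small-write cost comes from the standard parity update: changing one data block requires reading the old data and old parity, then writing the new data and new parity, where the new parity is the XOR of the old parity, the old data, and the new data. In RAID4 every such update touches the single parity device, which is why a faster parity device helps. The sketch below shows only that arithmetic; the function names are illustrative and nothing here models device timing.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def small_write(data_blocks, parity, index, new_block):
    """Update one data block and the parity without reading the other data blocks."""
    new_parity = xor_blocks(parity, data_blocks[index], new_block)
    updated = list(data_blocks)
    updated[index] = new_block
    return updated, new_parity


blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00", b"\x0f\x0f"]   # four data drives
parity = xor_blocks(*blocks)                                     # dedicated parity drive
updated, new_parity = small_write(blocks, parity, 2, b"\xaa\x55")
```

    The incrementally updated parity equals the parity recomputed from all four updated blocks, so only the target data drive and the parity drive take part in the write.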

  • 2011 ISSDM Day—SciHadoop: Array-based Query Processing in Hadoop
    James Ahrens, John Bent, Gary Grider, Michael Lang, Joe Buck, Scott Brandt, Maya Gokhale, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis, Wang-Chiew Tan

    Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly structured, array-based binary file formats, resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.
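
    One way to picture evaluating a holistic function in the map phase: a median cannot be merged from partial results, but if input is partitioned along logical array coordinates so that each split holds every value needed for its output cells, the map task can finish those cells itself. The sketch below does this for a per-row median over a small 2-D list. It is plain Python under that assumption, not Hadoop or SciHadoop code, and the function names are made up.

```python
from statistics import median


def partition_rows(array, rows_per_split):
    """Split along the logical row coordinate; yields (first_row_index, rows)."""
    for start in range(0, len(array), rows_per_split):
        yield start, array[start:start + rows_per_split]


def map_split(start, rows):
    """Each split holds complete rows, so the holistic median is final here."""
    return [(start + i, median(row)) for i, row in enumerate(rows)]


def run_query(array, rows_per_split=2):
    """Per-row median; nothing is left for a reduce step to combine."""
    results = {}
    for start, rows in partition_rows(array, rows_per_split):
        results.update(map_split(start, rows))
    return results


grid = [[1, 5, 3], [4, 4, 4], [9, 0, 2]]
row_medians = run_query(grid)
```

    Had the splits cut across rows instead, every raw value would have to be shipped to a reducer before any median could be computed, which is the data movement the partitioning is meant to avoid.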

  • 2011 ISSDM Day—The Stanford-UC Santa Cruz Project for Cooperative Computing with Algorithms, Data and People
    Neoklis Polyzotis

    This talk will provide an overview of the SCOOP project, whose broad theme is to leverage people as processing units to achieve some global objective. A primary focus of SCOOP is to optimize the usage of human computation in order to use as few resources (e.g., time, money) as possible while maximizing the quality of the final output. Our approach is based on the principle of declarative languages that has been applied very successfully in database systems. The talk will describe the main research thrusts in SCOOP and some of our recent accomplishments.

  • 2011 ISSDM Day—Using reputation systems to increase content quality: lessons from Wikipedia and Google Maps
    Luca de Alfaro
  • 2011 ISSDM Day—Viral Genomics and the Semantic Web
    Tracy Holsclaw

    Gaussian process (GP) models provide non-parametric methods to fit continuous curves observed with noise. Motivated by our investigation of dark energy, we develop a GP-based inverse method that allows for the direct estimation of the derivative of a curve. In principle, a GP model may be fit to the data directly, with the derivatives obtained by means of differentiation of the correlation function. However, it is known that this approach can be inadequate due to loss of information when differentiating. We present a new method of obtaining the derivative process by viewing this procedure as an inverse problem. We use the properties of a GP to obtain a computationally efficient fit. We illustrate our method with simulated data and apply it to our cosmological application.
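
    The two routes contrasted above can be written out with standard GP identities. The notation ($k$, $k_g$, $x_0$, $\varepsilon_i$) and the assumption that $f(x_0)$ is a known constant are ours, and the second formulation is only one plausible reading of the "inverse problem" view, not necessarily the exact model used in the talk.

```latex
% Direct route: fit f ~ GP(0, k) to the data, then differentiate the fit.
\begin{align*}
\operatorname{Cov}\big(f'(x),\, f'(x')\big)
  &= \frac{\partial^2 k(x, x')}{\partial x \,\partial x'} , &
\operatorname{Cov}\big(f(x),\, f'(x')\big)
  &= \frac{\partial k(x, x')}{\partial x'} .
\end{align*}

% Inverse route: place the GP prior on the derivative g = f' and relate it
% to the observations through integration (f(x_0) treated as known).
\begin{align*}
g &\sim \mathcal{GP}(0, k_g), \qquad
y_i = f(x_0) + \int_{x_0}^{x_i} g(s)\, ds + \varepsilon_i , \\
\operatorname{Cov}\big(f(x),\, f(x')\big)
  &= \int_{x_0}^{x} \int_{x_0}^{x'} k_g(s, t)\, dt\, ds , \\
\operatorname{Cov}\big(f(x),\, g(x')\big)
  &= \int_{x_0}^{x} k_g(s, x')\, ds .
\end{align*}
```

    In the second form the quantity of interest, $g$, is the object being modeled, and the data constrain it through the cross-covariance rather than through differentiation of a fitted curve.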

  • 2011 ISSDM Day—Viral Genomics and the Semantic Web
    Carla Kuiken

    The “HIV database and analysis platform” has been maintained in Los Alamos for 22 years, and has grown to be an internationally renowned resource for HIV data analysis. It is in the process of expanding to include hepatitis C virus and hemorrhagic fever viruses; the eventual goal is to make it a universal viral resource. This expansion necessitates much greater reliance on external data and information sources. These resources rarely use the same identifiers and frequently contain annotator- and submitter-specific language. While efforts have been underway for some time to standardize and cross-link biological information on the web, there is still a long way to go. I will describe the current status of the “Viral data analysis platform”, the (semantic) problems we have grappled with, and the local and global efforts at amelioration.

  • HCOMP'11—CrowdSight: Rapidly Prototyping Intelligent Visual Processing Apps
    Reid Porter, Mario Rodriguez, James Davis
  • IBM—Filtering Semi-Structured Documents Based on Faceted Feedback
    Lanbo Zhang, Yi Zhang
  • ICS2011—On the Role of NVRAM in Data Intensive HPC Architectures
    Sasha Ames, Maya Gokhale
  • NAS2011—QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language
    Sasha Ames, Maya Gokhale, Carlos Maltzahn
  • RTAS2011—On the Role of NVRAM in Data Intensive HPC Architectures
    Roberto Pineiro, Kleoni Ioannidou, Scott Brandt, Carlos Maltzahn
  • SC11—SciHadoop: Array-based Query Processing in Hadoop
    Kleoni Ioannidou, Joe Buck, Noah Watkins, Jeff LeFevre, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis

    Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop’s byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly structured, array-based binary file formats, resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.

  • SIGIR11—Filtering Semi-Structured Documents Based on Faceted Feedback
    Lanbo Zhang, Yi Zhang

    Special Interest Group on Information Retrieval (SIGIR)

    2010 Panels

  • PDSW10—5th Petascale Data Storage Workshop Supercomputing '10
    Carlos Maltzahn

    2010 Posters

  • 2010 ISSDM Day: Eye Tracking for Personalized Photography
    Sriram Swaminarayan, Steve Scher, James Davis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: File System Trace and Replay
    James Ahrens, John Bent, Carolyn Connor, Gary Grider, Joe Buck, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: File System Trace and Replay
    James Ahrens, John Bent, Carolyn Connor, Gary Grider, Noah Watkins, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: How do multi streaming regions form and evolve?
    Katrin Heitmann, Uliana Popov, Alex Pang

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Location-based Object Detection
    James Theiler, Damian Eads, David Helmbold

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Managing High-Bandwidth Real-Time Data Storage
    John Bent, HB Chen, David Bigelow, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: RAID4S Hardware Performance with a Linux Software RAID
    John Bent, Gary Grider, Meghan McClelland, James Nunez, Rosie Wacha, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Scalable Simulation of Parallel File Systems
    John Bent, Gary Grider, Meghan McClelland, James Nunez, Esteban Molina-Estolano, Carlos Maltzahn

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Statistical Modeling of Dark Energy and the Cosmological Constants
    Ujjaini Alam, Salman Habib, Katrin Heitmann, David Higdon, Tracy Holsclaw, Herbie Lee, Bruno Sanso

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: WikiTrust Turning Wikipedia Quantity into Quality
    Shelly Spearing, Ian Pye, Bo Adler, Luca de Alfaro

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Adaptive Information Filtering
    Carla Kuiken, Lanbo Zhang, Yi Zhang

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Memory as an I/O bottleneck - facts and consequences for high-performance data management.
    Tim Kaldewey, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: On-line Index Selection for Physical Database Tuning
    John Bent, Karl Schnaitter, Neoklis Polyzotis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language
    Sasha Ames, Carlos Maltzahn

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: RAD-FLOWS: Buffer management for predictable performance
    Roberto Pineiro, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 Mini-Showcase—File System Trace and Replay Redux
    Meghan McClelland, Noah Watkins
  • 2010 Mini-Showcase—Improving RAID Performance with Solid State Drives
    Meghan McClelland, Rosie Wacha
  • 2010 Mini-Showcase—Managing High-Bandwidth Real-Time Streaming Data
    John Bent, HB Chen, David Bigelow
  • 2010 Mini-Showcase—Scalable Simulation of Parallel Filesystems
    John Bent, Esteban Molina-Estolano
  • CIKM10—Discriminative Factored Prior Model for Personalized Content-Based Recommendation
    Lanbo Zhang, Yi Zhang
  • CMU PDL Visit Day—PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud
    John Bent, Esteban Molina-Estolano, Milo Polte, Scott Brandt, Garth Gibson, Carlos Maltzahn

    Parallel Data Lab (PDL)

  • Cosmic Calibration - Statistical Modeling for Dark Energy
    Ujjaini Alam, Salman Habib, Katrin Heitmann, Tracy Holsclaw, Herbie Lee, Bruno Sanso
  • EuroSys 2010—RAID4S: Adding SSDs to RAID Arrays
    John Bent, Rosie Wacha, Scott Brandt, Carlos Maltzahn
  • PDSW10—PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud
    John Bent, Esteban Molina-Estolano, Milo Polte, Scott Brandt, Garth Gibson, Maya Gokhale
  • PDSW10—QMDS: A File System Metadata Service Supporting a Graph Data Model-Based Query Language
    Sasha Ames, Maya Gokhale, Carlos Maltzahn
  • USENIX FAST 10—Design and Implementation of a Metadata-Rich File System
    Sasha Ames, Maya Gokhale, Carlos Maltzahn

    File and Storage Technologies (FAST)

  • USENIX FAST 10—Enabling Scientific Application I/O on Cloud FileSystems
    John Bent, Meghan McClelland, Esteban Molina-Estolano, Milo Polte, Scott Brandt, Garth Gibson, Carlos Maltzahn

    File and Storage Technologies (FAST)

  • USENIX FAST 10—Energy Efficient Striping in the Energy Aware Virtual File System
    Adam Manzanares, Xiao Qin

    File and Storage Technologies (FAST)

  • USENIX FAST 10—InfoGarden: A Casual-Game Approach to Digital Archive Management
    Carlos Maltzahn, Michael Mateas, Jim Whitehead

    File and Storage Technologies (FAST)

  • USENIX FAST 10—RAID4S: Adding SSDs to RAID Arrays
    John Bent, Rosie Wacha, Scott Brandt, Carlos Maltzahn

    File and Storage Technologies (FAST)

    2010 Presentations

  • 2010 ISSDM Day: Eye Tracking for Personalized Photography
    Sriram Swaminarayan, Steve Scher, James Davis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: File System Trace and Replay
    James Ahrens, John Bent, Carolyn Connor, Gary Grider, Noah Watkins, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: File System Trace and Replay
    James Ahrens, John Bent, Carolyn Connor, Gary Grider, Joe Buck, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: How do multi streaming regions form and evolve?
    Katrin Heitmann, Uliana Popov, Alex Pang

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Location-based Object Detection
    James Theiler, Damian Eads, David Helmbold

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Managing High-Bandwidth Real-Time Data Storage
    John Bent, HB Chen, David Bigelow, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: RAID4S Hardware Performance with a Linux Software RAID
    John Bent, Gary Grider, Meghan McClelland, James Nunez, Rosie Wacha, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Scalable Simulation of Parallel File Systems
    John Bent, HB Chen, Gary Grider, Meghan McClelland, James Nunez, Esteban Molina-Estolano, Carlos Maltzahn

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Statistical Modeling of Dark Energy and the Cosmological Constants
    Ujjaini Alam, Salman Habib, Katrin Heitmann, David Higdon, Tracy Holsclaw, Herbie Lee, Bruno Sanso

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: WikiTrust Turning Wikipedia Quantity into Quality
    Shelly Spearing, Ian Pye, Bo Adler, Luca de Alfaro

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Adaptive Information Filtering
    Carla Kuiken, Lanbo Zhang, Yi Zhang

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Exa-Scale FSIO - Can we get there? Can we afford to?
    Gary Grider

    This talk will describe the anticipated DOE Exascale initiative, a prospective very large extreme-scale supercomputing program being formulated by the DOE Office of Science and DOE NNSA. Motivations for the program, as well as how the program may proceed, will be presented. Anticipated Exascale machine dimensions will be provided as well. The costs of providing scalable file systems and I/O for these future very large supercomputers will be examined in detail.

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Investigating Efficient Real-Time Performance Guarantees on Storage Networks
    Andrew Shewmaker

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: Memory as an I/O bottleneck - facts and consequences for high-performance data management.
    Tim Kaldewey, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: On-line Index Selection for Physical Database Tuning
    John Bent, Karl Schnaitter, Neoklis Polyzotis

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language
    Sasha Ames, Carlos Maltzahn

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 ISSDM Day: RAD-FLOWS: Buffer management for predictable performance
    Roberto Pineiro, Scott Brandt

    Institute for Scalable Scientific Data Management (ISSDM)

  • 2010 Mini-Showcase—File System Trace and Replay Redux
    Meghan McClelland, Noah Watkins
  • 2010 Mini-Showcase—Improving RAID Performance with Solid State Drives
    Meghan McClelland, Rosie Wacha
  • 2010 Mini-Showcase—Managing High-Bandwidth Real-Time Streaming Data
    John Bent, HB Chen, David Bigelow
  • 2010 Mini-Showcase—Scalable Simulation of Parallel Filesystems
    John Bent, Esteban Molina-Estolano
  • 26th IEEE Symposium (MSST2010): Quality of Service Guarantees in High-Bandwidth, Real-Time Streaming Data Storage
    John Bent, HB Chen, David Bigelow, Scott Brandt

  • EuroSys 2010—RAID4S: Adding SSDs to RAID Arrays
    John Bent, Rosie Wacha, Scott Brandt, Carlos Maltzahn
  • PAN 2010: Detecting Wikipedia Vandalism using WikiTrust
    Ian Pye, Bo Adler, Luca de Alfaro

    PAN 2010 Lab: Uncovering Plagiarism, Authorship, and Social Software Misuse

  • SIGIR10—Interactive Retrieval Based on Faceted Feedback
    Lanbo Zhang, Yi Zhang

    Special Interest Group on Information Retrieval (SIGIR)

  • SIGMOD 2010—An Automated, yet Interactive and Portable DB designer
    Karl Schnaitter, Neoklis Polyzotis

    Association for Computing Machinery (ACM)

  • USENIX FAST 10—Enabling Scientific Application I/O on Cloud FileSystems
    John Bent, Meghan McClelland, Esteban Molina-Estolano, Milo Polte, Scott Brandt, Garth Gibson, Carlos Maltzahn

    File and Storage Technologies (FAST)

  • USENIX FAST 10—Energy Efficient Striping in the Energy Aware Virtual File System
    Adam Manzanares, Xiao Qin

    File and Storage Technologies (FAST)

  • USENIX FAST 10—InfoGarden: A Casual-Game Approach to Digital Archive Management
    Carlos Maltzahn, Michael Mateas, Jim Whitehead

    File and Storage Technologies (FAST)

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA
© Copyright 2008-09 Los Alamos National Security, LLC. All rights reserved.