Research
Current Projects
The CI Compass project: "CI CoE: CI Compass: An NSF CI Center of Excellence for Navigating the Major Facilities Data Lifecycle" provides expertise and active support to cyberinfrastructure practitioners at NSF Major Facilities in order to accelerate the data lifecycle and ensure the integrity and effectiveness of the cyberinfrastructure upon which research and discovery depend.
The vision for CI Compass is to be the leader in supporting and enhancing the national CI ecosystem that includes people, practical knowledge, and processes to facilitate knowledge sharing and discovery across the National Science Foundation's (NSF) Major Facilities (MFs).
Role: co-Principal Investigator
The X-CITE project: "CyberInfrastructure Training and Education for Synchrotron X-Ray Science (X-CITE)" develops an innovative cyberinfrastructure (CI) training program for the community of scientists using the CHESS synchrotron X-ray facility, who conduct research in materials science, physics, chemistry, biology, environmental science, and other domains. The training materials and associated training activities help reduce barriers to the use of CHESS instruments, data, and tools and enable scientists to make effective use of national computing resources and services, such as those offered by the NSF.
The training program covers five thematic areas: programming essentials, systems fundamentals, distributed computing with the CI ecosystem, X-ray science software, and data curation and the FAIR principles. The material is offered in several modes: self-paced training with notebooks, videos, and CI catalogs; office hours; in-person instruction sessions; CHESS user workshops; and tutorials at domain conferences.
Role: Principal Investigator
The SWARM project: "Scientific Workflow Applications on Resilient Metasystem" explores how distributed intelligence, specifically swarm intelligence (SI), can provide robust, performant, resilient, and fault-tolerant execution of DOE scientific workflows that span a continuum of resources, from edge devices near sensors and instruments through wide area networks to leadership-class systems. The goal is to design a resilient, SI-based Integrated Research Infrastructure (IRI) that can quickly recover from failures, adapt to changes in the environment, maximize overall resource utilization, and optimize the execution time of workflows submitted by DOE scientists.
Role: co-Principal Investigator
The PosEiDon project: "PosEiDon: Platform for Explainable Distributed Infrastructure" aims to advance the knowledge of how simulation and machine learning (ML) methodologies can be harnessed and amplified to improve DOE’s computational and data science. PosEiDon will provide an integrated platform that helps facility operators and scientists improve the overall end-to-end science workflow by (1) predicting the performance of complex workflows; (2) detecting and classifying infrastructure and workflow anomalies and "explaining" the sources of these anomalies; and (3) suggesting performance optimizations.
Role: co-Principal Investigator
The FlyNet project: "An 'On-the-fly' Deeply Programmable End-to-end Network-Centric Platform for Edge-to-Core Workflows" will provide an architecture and tools that enable scientists to include edge computing devices in their computational workflows. This capability is critical for low-latency applications such as drone video analytics and drone route planning. We will integrate cutting-edge network and compute infrastructure with in-network processing through new programming abstractions. The platform will leverage the Pegasus Workflow Management System to integrate these capabilities.
Role: co-Principal Investigator
Previous Projects
The DyNamo project: "Delivering a Dynamic Network-Centric Platform for Data-Driven Science (DyNamo)", an NSF CC* Integration project, is developing a network-centric platform to enable high-performance, adaptive data flows and coordinated access to multi-campus cyberinfrastructure facilities and community data repositories for observational scientists in adaptive weather sensing and ocean sciences.
Role: Principal Investigator
The CI CoE Pilot project is developing a model for a Cyberinfrastructure Center of Excellence (CI CoE) that facilitates community building and knowledge sharing, and applies best practices and innovative solutions for NSF Large Facility CI. CI CoE Pilot provides leadership, expertise, and active support to CI practitioners at NSF Major Facilities.
Role: co-Principal Investigator
The goal of the NSF CICI "Integrity Introspection for Scientific Workflows" (IRIS) project is to automatically detect, diagnose, and pinpoint the source of unintentional integrity anomalies in scientific workflows executing on distributed CI. The approach is to develop an appropriate threat model and incorporate it in an integrity analysis framework that collects workflow and infrastructure data and uses machine learning (ML) algorithms to perform the needed analysis.
Role: Principal Investigator
The Panorama 360 project provides a resource for the collection, analysis, and sharing of performance data about end-to-end scientific workflows executing on Department of Energy facilities.
Role: co-Principal Investigator
Funding: GENI initiative, National Science Foundation
ExoGENI is a networked cloud testbed that links GENI to two advances in virtual infrastructure services outside of GENI: open cloud computing (OpenStack) and dynamic circuit fabrics. ExoGENI orchestrates a federation of independent cloud sites located across the US and circuit providers, such as Internet2 and ESnet, through their native IaaS APIs, and links them to other GENI tools and resources. ExoGENI is, in effect, a widely distributed networked infrastructure-as-a-service (NIaaS) platform geared towards experimentation and computational tasks.
Role: Senior personnel
The SWIP project: "Scientific Workflow Integrity with Pegasus (SWIP)" strengthened cybersecurity controls in the Pegasus Workflow Management System in order to provide assurances with respect to the integrity of computational scientific methods. These strengthened controls enhanced both Pegasus’ handling of science data and its orchestration of software-defined networks and infrastructure. The result was increased trust in computational science and increased assurance in our ability to reproduce the science by allowing scientists to validate that data has not been changed since a workflow completed and that the results from multiple workflows are consistent.
Role: Senior personnel
The ADAMANT project: "Transforming Computational Science with ADAMANT (Adaptive Data-aware Multi-domain Application Network Topologies)" enabled computational workflow-driven science on multi-domain IaaS. We integrated ORCA resource provisioning with Pegasus WMS. We enabled provisioning of compute/storage/network resources by workflows in response to their needs through application-specific topology embedding. We enabled pre-planned movements of data over engineered dynamic connections between domains in support of computational workflow tasks.
Role: Technical lead and senior personnel