Accelerated Weighted Ensemble for Improved Protein Folding Statistics
The Accelerated Weighted Ensemble (AWE) method addresses a central difficulty of traditional molecular dynamics (MD): generating enough data to make statistically significant estimates of protein folding kinetics. AWE repeatedly simulates many models for a short time, resamples them to maintain the number of models in each state, and tracks macrostate transitions to account for non-Markovian behavior. Its partition is built from existing kinetic data, which approximates the correct weights. Distributing the independent simulations with WorkQueue and Condor makes it practical to run them at scale.
Presentation Transcript
The Accelerated Weighted Ensemble
Greatly Improved Protein Folding Statistics Using WorkQueue and Condor
Jeff Kinnison & Dr. Jesus A. Izaguirre
Studying a New Protein
HP24stab
- Subdomain of the villin headpiece
- Two-helical supersecondary structure
- 24 amino acids (406 atoms)
- Discovered in 2015; little kinetic information is available
Problems with Traditional MD
Computationally expensive
- Molecular force fields perform expensive operations on all atoms
- Timescales of interest quickly become intractable as protein size grows
- GPU resources that would increase efficiency are not always readily available
Events of interest are rare
- Protein folding occurs on the O(ns) to O(ms) scale
- There is no guarantee that a folding event will occur in a given simulation
Together, these two issues make it difficult to generate enough data to make statistically significant kinetic approximations.
Accelerated Weighted Ensemble (AWE)
1. Simulate a number of models for a short time
2. Resample to maintain the number of models in each state
3. Repeat until fluxes converge
Additionally, assign each state to a macrostate (folded, transition, unfolded) and track macrostate transitions to account for non-Markovian behavior.
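The simulate/resample loop above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the AWE implementation: `propagate` is a hypothetical stand-in for a short MD run (a random hop between neighbouring cells), the resampling step uses plain multinomial resampling (one simple unbiased choice; AWE's own scheme may differ), and the loop runs for a fixed number of iterations instead of testing flux convergence.

```python
import random
from collections import defaultdict


def resample(walkers, target_per_cell, rng=random):
    """Return `target_per_cell` equal-weight walkers in every occupied cell.

    Each cell keeps its total weight, so the overall weight is conserved.
    """
    by_cell = defaultdict(list)
    for w in walkers:
        by_cell[w["cell"]].append(w)

    resampled = []
    for cell, members in by_cell.items():
        total = sum(w["weight"] for w in members)
        # Draw parents with probability proportional to their weight.
        parents = rng.choices(
            members,
            weights=[w["weight"] for w in members],
            k=target_per_cell,
        )
        for p in parents:
            resampled.append(
                {"cell": cell, "coords": p["coords"], "weight": total / target_per_cell}
            )
    return resampled


def propagate(walker, n_cells, rng):
    """Hypothetical stand-in for a short MD run: hop to a neighbouring cell."""
    step = rng.choice((-1, 0, 1))
    cell = min(max(walker["cell"] + step, 0), n_cells - 1)
    return {"cell": cell, "coords": walker["coords"], "weight": walker["weight"]}


def run(n_cells=5, target_per_cell=10, n_iterations=20, seed=0):
    """Alternate short simulations and resampling for a fixed iteration count."""
    rng = random.Random(seed)
    walkers = [
        {"cell": 0, "coords": None, "weight": 1.0 / target_per_cell}
        for _ in range(target_per_cell)
    ]
    for _ in range(n_iterations):
        walkers = [propagate(w, n_cells, rng) for w in walkers]  # step 1
        walkers = resample(walkers, target_per_cell, rng)        # step 2
    return walkers
```

After every iteration each occupied cell holds exactly `target_per_cell` models and the weights still sum to one, which is the invariant the resampling step is meant to maintain.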
AWE Partition
[Figure: free energy surface of HP24stab]
[Figure: partition following the transition pathway]
The partition in AWE is based on existing kinetic data, approximating the correct weights.
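As a rough illustration of how a partition is used, the sketch below assigns a conformation to its nearest cell and then maps the cell to a macrostate. Everything here is an assumption made for the example: the two-dimensional coordinates, the Euclidean metric (a structural metric such as RMSD would be a more likely choice for real conformations), and the function names `assign_cell` and `macrostate`.

```python
import math


def assign_cell(point, centers):
    """Return the index of the cell center closest to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def macrostate(cell, folded_cells, unfolded_cells):
    """Map a cell index to folded, unfolded, or transition."""
    if cell in folded_cells:
        return "folded"
    if cell in unfolded_cells:
        return "unfolded"
    return "transition"


# Hypothetical cell centers laid out along a one-dimensional pathway.
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cell = assign_cell((0.9, 0.1), centers)
state = macrostate(cell, folded_cells={0}, unfolded_cells={2})
```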
Distributing Simulations with WorkQueue
- Each simulation is independent, so simulations can be run in parallel to increase efficiency
- WorkQueue allows scaling to the number of simulations in a particular AWE run
- AWE includes task cloning to overcome bottlenecks caused by slow workers
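The idea behind task cloning can be sketched with Python's standard library alone. This is not the WorkQueue API and not AWE's implementation: `run_with_cloning` and its `clone_after` threshold are hypothetical names, and a local thread pool stands in for remote workers. A task that is still unfinished after `clone_after` seconds is submitted a second time, and whichever copy finishes first supplies the result.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_with_cloning(fn, args_list, clone_after=0.05, max_workers=8):
    """Run fn(*args) for each entry, cloning tasks that take too long.

    The first copy of a task to finish wins; late copies are ignored.
    Note: the pool still waits for losing copies when it shuts down,
    because running threads cannot be cancelled.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {}    # future -> task index
        started = {}    # task index -> submission time
        cloned = set()  # task indices that already have a clone
        for i, args in enumerate(args_list):
            pending[pool.submit(fn, *args)] = i
            started[i] = time.monotonic()

        while len(results) < len(args_list):
            done, _ = wait(list(pending), timeout=clone_after,
                           return_when=FIRST_COMPLETED)
            for fut in done:
                i = pending.pop(fut)
                results.setdefault(i, fut.result())  # keep the first result only

            now = time.monotonic()
            for i, args in enumerate(args_list):
                overdue = now - started[i] > clone_after
                if i not in results and i not in cloned and overdue:
                    cloned.add(i)
                    pending[pool.submit(fn, *args)] = i
    return [results[i] for i in range(len(args_list))]
```

Cloning is safe here for the reason the slide gives: each simulation is independent, so running one twice wastes some compute but cannot change the outcome of any other task.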
Preliminary Trajectory Data
We created the AWE partition by collecting trajectory data using traditional MD on GPU. Each trajectory took 4 days to complete. Of the 36 trajectories collected, 19 were valid and only 9 contained folding events.
[Figure: folding first passage times for the nine original trajectories that folded]
AWE Setup
Two Systems
- 1000-cell
- 100-cell
- 10 models per state
MD Parameters
- T = 325 K
- Langevin dynamics with implicit solvent (γ = 0.91 ps⁻¹)
- Amber03 force field
- 250 ps simulation time
WorkQueue
- Maintained a factory requesting between 100 and 1000 workers
- All simulations ran on 4-core workers
- Used Condor workers only, to prevent AWE workers from taking over the cluster
AWE Condor Usage
[Figure: simulations per day, 100-cell partition]
[Figure: simulations per day, 1000-cell partition]
By leveraging WorkQueue and Condor, we were able to run O(10k) simulations per day.
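For a sense of scale, the throughput figure can be converted into aggregate simulated time using the 250 ps simulation length from the AWE setup. This is back-of-the-envelope arithmetic only: it takes exactly 10,000 simulations per day for the order-of-magnitude figure and assumes a 30-day month for the traditional MD baseline of 19 microseconds. Aggregate simulated time is not by itself a measure of how many folding events are observed.

```python
SIMS_PER_DAY = 10_000   # assumed value for the O(10k) figure on the slide
PS_PER_SIM = 250        # simulation length from the AWE setup, in picoseconds
PS_PER_US = 1_000_000   # picoseconds per microsecond

# Aggregate simulated time produced by AWE per day, in microseconds.
aggregate_us_per_day = SIMS_PER_DAY * PS_PER_SIM / PS_PER_US

# Traditional MD baseline: 19 microseconds over one month (30 days assumed).
baseline_us_per_day = 19 / 30

print(f"AWE aggregate: {aggregate_us_per_day} us/day")
print(f"Traditional MD: {baseline_us_per_day:.2f} us/day")
```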
AWE Results
We started with 19 microseconds of traditional MD trajectory data, computed over one month and containing nine folding events.
Conclusion
Both the coarse and fine partitions converged in one-sixth the time needed to generate the original trajectories and generated several orders of magnitude more folding events. By leveraging WorkQueue and Condor, AWE is able to quickly generate reliable approximations of protein kinetic properties.
Acknowledgements
We would like to thank Dr. Douglas Thain and the Cooperative Computing Lab students for making WorkQueue available and helping to integrate it with AWE. All computations were run on compute nodes provided by the Notre Dame Center for Research Computing.
References
- Hocking, H. G.; Häse, F.; Madl, T.; Zacharias, M.; Rief, M.; Žoldák, G. A Compact Native 24-Residue Supersecondary Structure Derived from the Villin Headpiece Subdomain. Biophys. J. 2015, 108, 678–686.
- Huber, G. A.; Kim, S. Weighted-ensemble Brownian dynamics simulations for protein association reactions. Biophys. J. 1996, 70, 97.
- Bhatt, D.; Zhang, B. W.; Zuckerman, D. M. Steady-state simulations using weighted ensemble path sampling. J. Chem. Phys. 2010, 133, 014110.
- Abdul-Wahid, B.; Yu, L.; Rajan, D.; Feng, H.; Darve, E.; Thain, D.; Izaguirre, J. A. Folding Proteins at 500 ns/hour with Work Queue. E-Science (e-Science), 2012 IEEE 8th International Conference on. 2012; pp 1–8.