Accelerated Weighted Ensemble for Improved Protein Folding Statistics

 
 
The Accelerated Weighted Ensemble
 
 
Greatly Improved Protein Folding Statistics
Using WorkQueue and Condor
 
 
Jeff Kinnison & Dr. Jesus A. Izaguirre
 
Studying a New Protein
 
HP24stab
Subdomain of the villin headpiece
Two-helix supersecondary structure
24 amino acids (406 atoms)
Discovered in 2015; little kinetic information is available
 
Problems with Traditional MD
 
Computationally Expensive
Molecular force fields perform expensive operations on all atoms
Timescales of interest quickly become intractable as protein size grows
GPU resources to increase efficiency are not always readily available
Events of Interest Are Rare
Protein folding occurs on timescales from O(ns) to O(ms)
There is no guarantee that a folding event will occur in a given simulation
 
 
Together, these two issues make it difficult to generate enough data
for statistically significant kinetic estimates.
 
Accelerated Weighted Ensemble (AWE)
 
1. Simulate a number of models for a short time
2. Resample to maintain the number of models in each state
3. Repeat until fluxes converge
 
Additionally, assign each state to a macrostate (folded, transition, unfolded) and track
macrostate transitions to account for non-Markovian behavior.
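
The three steps above can be sketched on a toy system. This is a minimal illustration, not the AWE implementation: the 1-D random walk standing in for MD, the three unit cells, the helper names (cell_of, propagate, resample, occupancy, run), and the simple "draw proportionally to weight, then equalize" resampling rule are all assumptions made for the example. A fixed iteration count replaces the flux-convergence test, and macrostate tracking is omitted.

```python
import random
from collections import Counter, defaultdict

N_CELLS = 3  # hypothetical partition of the interval [0, 3) into unit cells


def cell_of(x):
    """Map a coordinate to the index of the cell that contains it."""
    return min(int(x), N_CELLS - 1)


def propagate(x, rng):
    """Stand-in for one short MD run: a Gaussian step, clamped to [0, 3)."""
    return min(max(x + rng.gauss(0.0, 0.3), 0.0), N_CELLS - 1e-9)


def resample(walkers, target, rng):
    """Return `target` equal-weight walkers per occupied cell.

    Walkers are drawn in proportion to their weights, and the total
    weight of each cell is conserved.
    """
    by_cell = defaultdict(list)
    for x, w in walkers:
        by_cell[cell_of(x)].append((x, w))
    resampled = []
    for members in by_cell.values():
        xs = [x for x, _ in members]
        ws = [w for _, w in members]
        total = sum(ws)
        picks = rng.choices(xs, weights=ws, k=target)
        resampled.extend((x, total / target) for x in picks)
    return resampled


def occupancy(walkers):
    """Count walkers per cell."""
    return Counter(cell_of(x) for x, _ in walkers)


def run(n_iterations=20, target=10, seed=0):
    """Alternate short propagation (step 1) and resampling (step 2)."""
    rng = random.Random(seed)
    walkers = [(0.5, 1.0 / target)] * target  # all weight starts in cell 0
    for _ in range(n_iterations):  # step 3, with a fixed count instead of a flux test
        walkers = [(propagate(x, rng), w) for x, w in walkers]
        walkers = resample(walkers, target, rng)
    return walkers
```

After any number of iterations the weights still sum to 1 and every occupied cell holds exactly `target` walkers, which is the property that keeps rarely visited states well sampled.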
 
AWE Partition
 
The AWE partition is built from existing kinetic data, which provides starting weights that approximate the correct ones.
 
Free Energy Surface of HP24stab
 
Partition Following Transition Pathway
 
Distributing Simulations with WorkQueue
 
Each simulation is independent, so simulations can be run in parallel to increase efficiency
WorkQueue allows scaling to the number of simulations in a particular AWE run
AWE includes task cloning to overcome bottlenecks caused by slow workers
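
The task-cloning idea can be illustrated without a cluster. The sketch below uses only the Python standard library, not the WorkQueue API: threads stand in for workers, sleep times stand in for simulation runtimes, and the names run_walker, run_with_cloning, patience, and clone_delay are hypothetical. Any task still unfinished after a grace period is submitted a second time, and the first copy to finish supplies the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed, wait


def run_walker(walker_id, delay):
    """Stand-in for one short simulation; `delay` mimics the worker's speed."""
    time.sleep(delay)
    return walker_id, delay


def run_with_cloning(delays, patience=0.1, clone_delay=0.01):
    """Run one task per walker and clone any task that outlives `patience`.

    delays: dict mapping walker id to the runtime of its original task.
    Returns a dict mapping walker id to the runtime of the copy that
    finished first.
    """
    results = {}
    # Twice as many threads as walkers, so clones never wait for a free worker.
    with ThreadPoolExecutor(max_workers=2 * len(delays)) as pool:
        owner = {pool.submit(run_walker, w, d): w for w, d in delays.items()}
        _, pending = wait(owner, timeout=patience)
        # Clone each straggler onto another (here: fast) worker.
        for fut in list(pending):
            walker_id = owner[fut]
            owner[pool.submit(run_walker, walker_id, clone_delay)] = walker_id
        # Keep the first result seen for each walker and ignore later copies.
        for fut in as_completed(owner):
            walker_id, elapsed = fut.result()
            results.setdefault(walker_id, elapsed)
    return results


if __name__ == "__main__":
    # Walker 2 lands on a slow worker; its clone finishes first.
    print(run_with_cloning({0: 0.01, 1: 0.01, 2: 0.6}))
```

Threads cannot be cancelled, so the slow copy here simply runs to completion and is ignored; a real scheduler would normally cancel it to free the worker.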
 
 
Preliminary Trajectory Data
 
We created the AWE partition from trajectory data collected with
traditional MD on GPUs. Each trajectory took 4 days to complete.
 
Of the 36 trajectories collected, 19 were valid and only 9 contained
folding events.
 
Folding first passage times for the nine original
trajectories that folded.
 
AWE Setup
 
Two Systems
1000-cell
100-cell
10 models per state
MD Parameters
T = 325 K
Langevin dynamics with implicit solvent (λ = 0.91 ps⁻¹)
Amber03 force field
250 ps simulation time
 
WorkQueue
Maintained a factory requesting between 100 and 1000 workers
All simulations ran on 4-core workers
Used Condor workers only to prevent AWE workers from taking over the cluster
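
As a rough sizing aid, the setup figures above imply the following workload per AWE iteration. The calculation assumes that every cell is occupied and that the 10-models-per-state setting applies to both systems; the function name per_iteration is hypothetical.

```python
PS_PER_SIMULATION = 250  # from the setup: 250 ps per simulation
MODELS_PER_STATE = 10    # from the setup: 10 models per state


def per_iteration(n_cells):
    """Return (simulations, aggregate simulated time in ns) for one iteration.

    Assumes every cell holds exactly MODELS_PER_STATE models.
    """
    simulations = n_cells * MODELS_PER_STATE
    aggregate_ns = simulations * PS_PER_SIMULATION / 1000
    return simulations, aggregate_ns


if __name__ == "__main__":
    for n_cells in (100, 1000):
        sims, ns = per_iteration(n_cells)
        print(f"{n_cells}-cell: {sims} simulations, {ns} ns per iteration")
```

Under these assumptions a full iteration of the 1000-cell system is 10,000 simulations, or 2.5 microseconds of aggregate simulated time.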
 
AWE Condor Usage
 
By leveraging WorkQueue and Condor, we were able to
run O(10k) simulations per day.
 
100-Cell Partition Simulations Per Day
 
1000-Cell Partition Simulations Per Day
 
AWE Results
 
Started with 19 microseconds of traditional MD trajectory data containing nine
folding events, computed over one month.
 
Conclusion
 
Both the coarse and fine partitions converged in one-sixth of the time
needed to generate the original trajectories and produced
several orders of magnitude more folding events.
 
 
 
By leveraging WorkQueue and Condor, AWE is able to quickly
generate reliable approximations of protein kinetic properties.
 
 
Acknowledgements
 
We would like to thank Dr. Douglas Thain and the Cooperative Computing
Lab students for making WorkQueue available and helping to integrate it
with AWE.
 
All computations were run on compute nodes provided by the Notre Dame
Center for Research Computing.
 
References
Hocking, H. G.; Häse, F.; Madl, T.; Zacharias, M.; Rief, M.; Žoldák, G. A Compact Native 24-Residue Supersecondary Structure Derived from the Villin Headpiece Subdomain. Biophys. J. 2015, 108, 678–686.
Huber, G. A.; Kim, S. Weighted-ensemble Brownian dynamics simulations for protein association reactions. Biophys. J. 1996, 70, 97.
Bhatt, D.; Zhang, B. W.; Zuckerman, D. M. Steady-state simulations using weighted ensemble path sampling. J. Chem. Phys. 2010, 133, 014110.
Abdul-Wahid, B.; Yu, L.; Rajan, D.; Feng, H.; Darve, E.; Thain, D.; Izaguirre, J. A. Folding Proteins at 500 ns/hour with Work Queue. E-Science (e-Science), 2012 IEEE 8th International Conference on. 2012; pp 1–8.
 
 

The Accelerated Weighted Ensemble (AWE) approach addresses the challenges faced by traditional molecular dynamics (MD) simulations in generating statistically significant kinetic data for protein folding. By utilizing methods such as WorkQueue and Condor, AWE enhances efficiency and accuracy in studying protein folding kinetics. The strategy involves resampling models, simulating short timeframes, and tracking macrostate transitions to capture non-Markovian behavior. AWE's partitioning based on existing kinetic data optimizes the weighting process, leading to improved outcomes in protein folding studies.

  • Protein Folding
  • Accelerated Weighted Ensemble
  • Molecular Dynamics
  • WorkQueue
  • Protein Kinetics

Uploaded on Sep 23, 2024



