Workflow Management and Systematic Organization

Introduction to CONNJUR Workflow Builder and Yes Workflow
2017 Summer Workshop: June 29, 2017
Workflows (Wikipedia)
A workflow consists of an orchestrated and repeatable pattern of business activity enabled by the systematic organization of resources into processes that transform materials, provide services, or process information. It can be depicted as a sequence of operations, declared as work of a person or group, an organization of staff, or one or more simple or complex mechanisms.
From a more abstract or higher-level perspective, workflow may be considered a view or representation of real work. The flow being described may refer to a document, service or product that is being transferred from one step to another.
Workflows (Examples)
On the first day Frank described an iterative “workflow” by which a spectroscopist converts Varian/Bruker data into nmrPipe format, resolves ambiguities, performs preliminary processing, resolves phasing, reprocesses, and iterates until done.
Bertram Ludäscher: ASAP
Automate computation
Scalable
Adaptable for reuse
Provenance: capture processing history and data lineage
Kepler
Ludäscher, Bertram, Ilkay Altintas, Chad Berkley, Dan Higgins, Efrat Jaeger, Matthew Jones, Edward A. Lee, Jing Tao, and Yang Zhao. 2006. “Scientific Workflow Management and the Kepler System.” Concurrency and Computation: Practice and Experience 18 (10): 1039–65.
CONNJUR Workflow Builder
M. Fenwick, G. Weatherby, J. Vyas, C. Sesanker, T. O. Martyn, H. J. Ellis, and M. R. Gryk (2015). CONNJUR Workflow Builder: a software integration environment for spectral reconstruction. J Biomol NMR, 62, 313-26.
Provenance Types (Michael Wilde, Argonne Labs)
Prospective Provenance: the specification of the workflow's procedure calls and data dependencies (acqu, workflow)
Retrospective Provenance: the recordings of when and where each procedure ran, and how each invocation behaved (acqus, reconstruction)
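The two provenance types can be made concrete with a small sketch. It assumes a hypothetical two-step workflow; the step names, data names, and the `run_with_provenance` helper are illustrative only and are not part of CONNJUR or any NMR package. The plan declared before execution plays the role of prospective provenance, and the records collected while running play the role of retrospective provenance.

```python
import platform
import time
from datetime import datetime, timezone

# Prospective provenance: the plan, declared before anything runs.
# Step and data names are hypothetical placeholders.
WORKFLOW = [
    {"step": "convert", "inputs": ["fid"], "outputs": ["pipe_data"]},
    {"step": "transform", "inputs": ["pipe_data"], "outputs": ["spectrum"]},
]


def run_with_provenance(workflow, implementations, initial_data):
    """Execute each declared step; return (data, retrospective records)."""
    data = dict(initial_data)
    records = []
    for spec in workflow:
        func = implementations[spec["step"]]
        args = [data[name] for name in spec["inputs"]]
        started = datetime.now(timezone.utc)
        t0 = time.perf_counter()
        results = func(*args)
        elapsed = time.perf_counter() - t0
        # Each step returns a tuple with one value per declared output.
        for name, value in zip(spec["outputs"], results):
            data[name] = value
        # Retrospective provenance: when and where the step ran,
        # and how long the invocation took.
        records.append({
            "step": spec["step"],
            "started_utc": started.isoformat(),
            "host": platform.node(),
            "seconds": elapsed,
            "outputs": list(spec["outputs"]),
        })
    return data, records


if __name__ == "__main__":
    # Toy stand-ins for real conversion and processing programs.
    steps = {
        "convert": lambda fid: ([2 * x for x in fid],),
        "transform": lambda pipe_data: (sum(pipe_data),),
    }
    data, records = run_with_provenance(WORKFLOW, steps, {"fid": [1.0, 0.5, 0.25]})
    for record in records:
        print(record["step"], record["started_utc"], record["host"])
```

The plan can be inspected without running anything, while the records only exist after a run; that asymmetry is the point of the distinction.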
Yes Workflow
Annotation system for Prospective Provenance from scripts
Reproducibility
Replicability vs. reproducibility, or is it the other way around?
Reproducibility, replicability, reusability, repeatability
Reproducibility (Dagstuhl Working Group)
PRIMAD
Platform – Portability vs. reproducibility  (OS or hardware platforms)
Research Objective (goal of computation)
Implementation  (Fast Fourier Transform)
Method  (Fourier Transform)
Actors (Dagstuhl group defines agent as human)
Data (data used in study)
Rauber, A., Braganholo, V., Dittrich, J., et al. (2016). PRIMAD – Information gained by different types of reproducibility. In Reproducibility of Data-Oriented Experiments in e-Science (Dagstuhl Seminar 16041). Freire, J., Fuhr, N., & Rauber, A., editors.
Gryk, M. & Ludäscher, B. (2017). Workflows and provenance: Towards information science solutions for the natural sciences. Library Trends, in press.
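One way to read the PRIMAD list is as a record with six fields, where a rerun is characterized by which fields were changed. The sketch below is only an illustration of that reading: the `Run` class, the `changed_components` helper, and all example values are hypothetical and are not taken from the Dagstuhl report.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Run:
    """One computational run described by the six PRIMAD components."""
    platform: str
    research_objective: str
    implementation: str
    method: str
    actor: str
    data: str


def changed_components(original, rerun):
    """List the PRIMAD components that differ between two runs."""
    return [
        f.name
        for f in fields(Run)
        if getattr(original, f.name) != getattr(rerun, f.name)
    ]


# Hypothetical example values, chosen to echo the slide's examples.
original = Run(
    platform="Linux workstation",
    research_objective="reconstruct spectrum",
    implementation="Fast Fourier Transform, library A",
    method="Fourier Transform",
    actor="spectroscopist 1",
    data="hnco fid",
)

# Same method and data, but a different implementation on a different platform.
rerun = replace(
    original,
    platform="macOS laptop",
    implementation="Fast Fourier Transform, library B",
)

if __name__ == "__main__":
    print(changed_components(original, rerun))
```

Recording which components changed makes it explicit what a successful rerun actually demonstrates, for example portability when only the platform differs.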
Metadata Definition
Workflow -> Provenance -> Reproducibility
All rely on the capture of metadata
Definition 1: Data about Data
Definition 2: Metadata As Surrogate
Yes Workflow
#@BEGIN main
#@IN raw_turkey @URI store:shnucks_turkey
#@OUT cooked_turkey @URI plate:delicious_turkey

#@BEGIN survey_guests
#@OUT food_allergies
#@END survey_guests

#@BEGIN brining
#@IN raw_turkey @URI store:shnucks_turkey
#@PARAM seasonings
#@OUT brined_turkey
#@END brining

#@BEGIN weighing
#@IN brined_turkey
#@OUT weighed_turkey
#@OUT weight
#@END weighing

#@BEGIN stuffing
#@IN weighed_turkey
#@IN stuffing_ingredients
#@IN food_allergies
#@OUT stuffed_turkey
#@END stuffing

#@BEGIN baking
#@PARAM weight
#@PARAM temperature
#@PARAM duration
#@IN stuffed_turkey
#@OUT cooked_turkey @URI plate:delicious_turkey
#@END baking
#@END main
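Because the annotations above live in comments, a plain text scan is enough to recover the workflow structure they declare. The following is a minimal sketch of that idea and not the Yes Workflow tool itself: `parse_blocks` and `data_flow` are hypothetical helpers, only the uppercase keywords shown on the slide are recognized, and `@URI` is ignored.

```python
import re

# Matches the annotation keywords used on the slide; @URI is deliberately ignored.
ANNOTATION = re.compile(r"#\s*@(BEGIN|END|IN|OUT|PARAM)\s+(\S+)")

SAMPLE = """\
#@BEGIN main
#@IN raw_turkey @URI store:shnucks_turkey
#@OUT cooked_turkey @URI plate:delicious_turkey
#@BEGIN brining
#@IN raw_turkey
#@PARAM seasonings
#@OUT brined_turkey
#@END brining
#@BEGIN weighing
#@IN brined_turkey
#@OUT weighed_turkey
#@OUT weight
#@END weighing
#@END main
"""


def parse_blocks(script_text):
    """Return {block: {"in": [...], "out": [...], "param": [...]}}."""
    blocks = {}
    open_blocks = []  # stack of blocks that have begun but not yet ended
    for line in script_text.splitlines():
        match = ANNOTATION.search(line)
        if match is None:
            continue
        keyword, name = match.groups()
        if keyword == "BEGIN":
            blocks[name] = {"in": [], "out": [], "param": []}
            open_blocks.append(name)
        elif keyword == "END":
            if not open_blocks or open_blocks[-1] != name:
                raise ValueError(f"unmatched @END {name}")
            open_blocks.pop()
        elif open_blocks:
            # Attach the port to the innermost open block.
            blocks[open_blocks[-1]][keyword.lower()].append(name)
    if open_blocks:
        raise ValueError(f"missing @END for {open_blocks[-1]}")
    return blocks


def data_flow(blocks):
    """List (producer, data item, consumer) links between blocks."""
    edges = []
    for producer, spec in blocks.items():
        for item in spec["out"]:
            for consumer, other in blocks.items():
                if consumer != producer and item in other["in"] + other["param"]:
                    edges.append((producer, item, consumer))
    return edges


if __name__ == "__main__":
    for edge in data_flow(parse_blocks(SAMPLE)):
        print(" -> ".join(edge))
```

The script being annotated never has to run: the structure comes entirely from the comments, which is why this counts as prospective provenance.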
Yes Workflow
echo 'Converting from Varian to NMRPipe format'
var2pipe -in ./fid \
  -xN      1024      -yN      128          -zN      64 \
  -xT      512       -yT      64           -zT      32 \
  -xMODE   Complex   -yMODE   States-TPPI  -zMODE   Rance-Kay \
  -xSW     12000     -ySW     6000         -zSW     2000 \
  -xOBS    599.5694  -yOBS    125.768      -zOBS    60.7438 \
  -xCAR    4.772     -yCAR    45           -zCAR    119 \
  -xLAB    1H        -yLAB    13C          -zLAB    15N \
  -ndim    3         -aq2D    States \
  -out ./data/hnco%03d.pipe -verb -ov
sleep 1
echo 'Transforming 3-dimensional NMR data!'
echo 'Processing F3/F1 dimensions first'
xyz2pipe -in data/hnco%03d.pipe -x -verb  \
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly  \
| nmrPipe -fn CBF -last 12  \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 512 -c 0.5  \
| nmrPipe -fn ZF -size 2048  \
| nmrPipe -fn FT -verb  \
| nmrPipe -fn PS -p0 0 -p1 0 -di  \
| nmrPipe -fn EXT -left -sw -verb  \
| nmrPipe -fn TP  \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 64 -c 0.5  \
| nmrPipe -fn ZF -size 256  \
| nmrPipe -fn FT -verb  \
| nmrPipe -fn PS -p0 0 -p1 0 -di  \
| pipe2xyz -out ft/hnco%03d.ft2 -y
xyz2pipe -in ft/hnco%03d.ft2 -z -verb  \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 32 -c 0.5  \
| nmrPipe -fn ZF -size 128  \
| nmrPipe -fn FT -neg -verb  \
| nmrPipe -fn PS -p0 0 -p1 0 -di  \
| pipe2xyz -out ft/hnco%03d.ft3 -z
Website
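The conversion command above mixes per-axis acquisition values (sizes, sweep widths, observe frequencies, labels) with options that only concern the tool and the file system (input and output paths, verbosity, overwrite). The sketch below separates the two groups. It is an illustration under stated assumptions: `split_arguments` is a hypothetical helper, the command string is a shortened copy of the slide's command, and treating the `-x…`, `-y…`, `-z…` flags as the acquisition metadata is this sketch's reading rather than a rule taken from the nmrPipe documentation.

```python
import re
import shlex

# Flags of the form -xN, -ySW, -zLAB: one axis letter followed by a key.
AXIS_FLAG = re.compile(r"^-([xyz])([A-Za-z]+)$")

# Shortened copy of the conversion command shown on the slide.
COMMAND = (
    "var2pipe -in ./fid "
    "-xN 1024 -yN 128 -zN 64 "
    "-xSW 12000 -ySW 6000 -zSW 2000 "
    "-xOBS 599.5694 -yOBS 125.768 -zOBS 60.7438 "
    "-xLAB 1H -yLAB 13C -zLAB 15N "
    "-ndim 3 -aq2D States "
    "-out ./data/hnco%03d.pipe -verb -ov"
)


def split_arguments(command):
    """Split a command line into per-axis values and all other options.

    Caveat: a value that itself starts with '-' (a negative number, for
    example) would be mistaken for a flag by this simple scan.
    """
    tokens = shlex.split(command)
    per_axis = {}
    other = {}
    i = 1  # skip the program name
    while i < len(tokens):
        flag = tokens[i]
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        value = tokens[i + 1] if has_value else True
        match = AXIS_FLAG.match(flag)
        if match and has_value:
            axis, key = match.groups()
            per_axis.setdefault(axis, {})[key] = value
        else:
            other[flag.lstrip("-")] = value
        i += 2 if has_value else 1
    return per_axis, other


if __name__ == "__main__":
    per_axis, other = split_arguments(COMMAND)
    for axis, values in per_axis.items():
        print(axis, values)
    print("other options:", other)
```

Once the values are in a dictionary rather than buried in a command line, they can be stored as metadata alongside the data and compared across processing runs.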
#! /bin/csh
# My Processing Script
nmrPipe -in mydata.pipe \
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \
| nmrPipe -fn CBF -last 12 \
| nmrPipe -fn GMB -lb -7 -gb 0.1 -size 512 -c 0.5 \
| nmrPipe -fn ZF -size 2048 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 68.6 -p1 -34.8 -di \
| nmrPipe -fn EXT -left -sw -verb \
| nmrPipe -fn TP \
| nmrPipe -fn LP -fb -ord 30 -x1 2 -xn 128 -pred 64 -fix -fixMode 1 -after \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 192 -c 0.5 \
| nmrPipe -fn ZF -size 256 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 -9.0 -p1 20.0 -di \
-out mydata.ft2 -ov
bash vs. csh: the shell wars
# marks a comment
Which nmrPipe tool? nmrPipe, var2pipe, xyz2pipe, etc.
Watch for trailing spaces!
Syntax? Filenames and filesystems
How much of the above information is NMR related?
How much is related to nmrPipe or the computer we are using?
Functionality and Function Order!
What procedures do we use to massage our data?
What procedures do we use to transform our data?
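The emphasis on function order can be checked mechanically, since every processing step in the script is named by a `-fn` argument. The sketch below pulls those names out in order and splits them at the transpose. The helper names are hypothetical, and reading `TP` as the boundary between the two dimensions being processed is this sketch's interpretation of the script.

```python
import re

# The processing script from the slide, kept verbatim as a raw string
# so the line-continuation backslashes survive.
SCRIPT = r"""
nmrPipe -in mydata.pipe \
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \
| nmrPipe -fn CBF -last 12 \
| nmrPipe -fn GMB -lb -7 -gb 0.1 -size 512 -c 0.5 \
| nmrPipe -fn ZF -size 2048 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 68.6 -p1 -34.8 -di \
| nmrPipe -fn EXT -left -sw -verb \
| nmrPipe -fn TP \
| nmrPipe -fn LP -fb -ord 30 -x1 2 -xn 128 -pred 64 -fix -fixMode 1 -after \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 192 -c 0.5 \
| nmrPipe -fn ZF -size 256 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 -9.0 -p1 20.0 -di \
-out mydata.ft2 -ov
"""


def function_order(script_text):
    """Return the processing function names in the order they are applied."""
    return re.findall(r"-fn\s+([A-Za-z0-9]+)", script_text)


def split_at_transpose(names):
    """Split the function list into the parts before and after the first TP."""
    index = names.index("TP")
    return names[:index], names[index + 1:]


if __name__ == "__main__":
    names = function_order(SCRIPT)
    first, second = split_at_transpose(names)
    print("before TP:", first)
    print("after TP: ", second)
```

The extracted list separates what is done (the function names and their order) from how it is spelled in one tool's syntax, which is the kind of information the two questions above are asking about.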

A workflow comprises orchestrated business activities organized into processes that transform materials, provide services, or process information. Learn about different types of workflows, examples, the Kepler system, CONNJUR Workflow Builder, provenance types, and the importance of reproducibility in scientific workflows.

  • Workflow
  • Organization
  • Kepler System
  • Provenance
  • Reproducibility

Uploaded on Mar 10, 2025




