
Advanced Techniques for Submitting and Converting Files Efficiently
Explore advanced techniques for submitting HTCondor jobs through a media-conversion example: building an advanced submit file, fixing output filename problems with the $F() filename expansions, and developing custom print formats.
Presentation Transcript

Submitting Jobs (and how to find them)
John (TJ) Knoeller
Center for High Throughput Computing

Overview
- Development of an advanced submit file, using as many techniques and tricks as possible
- Custom print formats
- A few more random tricks (time permitting)

The Story
I have a lot of media files that I have collected over the years. I want to convert them all to .mp4.
(Sounds like a high-throughput problem.)

Basic submit file for conversion
    Executable = ffmpeg
    Transfer_executable = false
    Should_transfer_files = YES
    file = S1E2 The Train Job.wmv
    Transfer_input_files = $(file)
    Args = "-i '$(file)' '$(file).mp4'"
    Queue

Converting a set of files
    Transfer_input_files = $(file)
    Args = "-i '$(file)' '$(file).mp4'"
    Queue FILE from (
        S1E1 Serenity.wmv
        S1E2 The Train Job.wmv
        S1E3 Bushwhacked.wmv
        S1E4 Shindig.wmv
    )
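
The item list in a Queue ... from ( ... ) statement does not have to be typed by hand. As a rough sketch in plain Python (this is not an HTCondor API, and the helper name queue_from_block is made up for illustration), a script could render the block from a list of filenames:

```python
def queue_from_block(var, items):
    """Render a 'Queue <var> from ( ... )' statement, one item per line."""
    lines = ["Queue {0} from (".format(var)]
    lines.extend("    " + item for item in items)
    lines.append(")")
    return "\n".join(lines)


# Hypothetical file list; in practice this might come from os.listdir().
files = ["S1E1 Serenity.wmv", "S1E2 The Train Job.wmv"]
print(queue_from_block("FILE", files))
```

The printed text can be appended to a submit file that has no Queue line of its own.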

Output filename problems
Output is $(file).mp4, so the output files are named
    S1E1 Serenity.wmv.mp4
    S1E2 The Train Job.wmv.mp4
    S1E3 Bushwhacked.wmv.mp4
    S1E4 Shindig.wmv.mp4

$F() to the rescue
$Fqpdnxba() expands to parts of a filename
    file = "./Video/Firefly/S1E4 Shindig.wmv"
    $Fp(file)   -> ./Video/Firefly/
    $Fqp(file)  -> "./Video/Firefly"
    $Fqpa(file) -> './Video/Firefly'
    $Fd(file)   -> Firefly/
    $Fdb(file)  -> Firefly
    $Fn(file)   -> S1E4 Shindig
    $Fx(file)   -> .wmv
    $Fnx(file)  -> S1E4 Shindig.wmv
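
For readers who think in Python, the unquoted expansions above map roughly onto path-splitting calls from the standard library. This is only an analogy written for illustration (the function name f_parts is hypothetical, and it is not how HTCondor implements $F()); it covers the p, d, db, n, x, and nx forms and ignores the quoting options q and a.

```python
import posixpath


def f_parts(path):
    """Approximate a few $F() expansions for a '/'-separated path."""
    parent, filename = posixpath.split(path)
    name, ext = posixpath.splitext(filename)
    last_dir = posixpath.basename(parent)
    return {
        "p": parent + "/" if parent else "",      # full parent path
        "d": last_dir + "/" if last_dir else "",  # last directory only
        "db": last_dir,                           # last directory, no slash
        "n": name,                                # base name, no extension
        "x": ext,                                 # extension with the dot
        "nx": name + ext,                         # name plus extension
    }


parts = f_parts("./Video/Firefly/S1E4 Shindig.wmv")
for key in ("p", "d", "db", "n", "x", "nx"):
    print("$F%s(file) -> %s" % (key, parts[key]))
```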

$Fn() is name without extension
    Transfer_Input_Files = $(file)
    Args = "-i '$Fnx(file)' '$Fn(file).mp4'"
Resulting files are now
    S1E1 Serenity.mp4
    S1E2 The Train Job.mp4
    S1E3 Bushwhacked.mp4
    S1E4 Shindig.mp4

$Fq() and Arguments
- $Fq(file) expands to quoted "filename"
- This gives a "parse error" in an Arguments statement
- For Args use '$F(file)' instead
    - Becomes 'filename' on Linux
    - Becomes "filename" on Windows
- In 8.6 you can use $Fqa(file) instead

"new" Args preserves spaces
    FILE = The Train Job.wmv
    Args = "-i '$Fnx(file)' -w640 '$Fn(file).mp4'"

    # Tool tip: see it before you submit it.
    condor_submit test.sub -dump test.ads
    condor_status -ads test.ads -af Arguments
    -i The' 'Train' 'Job.wmv -w640 The' 'Train' 'Job.mp4

    # On *nix the job sees
    -i 'The Train Job.wmv' -w640 'The Train Job.mp4'
    # On Windows the job sees
    -i "The Train Job.wmv" -w640 "The Train Job.mp4"

Sometimes you can't use Args
- Argument quoting is not portable across operating systems
    - Linux needs space and ' escaped
    - Windows needs double quotes around filenames that have a space or ^
- What the job sees can be hard to predict
- Also, Transfer_input_files will not transfer a file with a comma in the name
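
The portability problem can be seen without HTCondor at all. The sketch below uses only the Python standard library to render the same four arguments in POSIX-shell style and in Windows command-line style. It illustrates the two quoting conventions, not HTCondor's own Args parser; note that subprocess.list2cmdline is an undocumented helper in the standard library.

```python
import shlex
import subprocess

args = ["-i", "The Train Job.wmv", "-w640", "The Train Job.mp4"]

# POSIX-shell style: single quotes protect the spaces.
posix_line = " ".join(shlex.quote(a) for a in args)

# Windows style (MS C runtime rules): double quotes protect the spaces.
windows_line = subprocess.list2cmdline(args)

print(posix_line)
print(windows_line)

# A POSIX-style parser recovers the original four arguments.
print(shlex.split(posix_line) == args)
```

The same argument list produces two different command lines, which is why a single Args string is hard to get right for both platforms.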

Alternative to Args
- Use a script as your executable
- Use custom job attributes to pass information to the script
    # these both set CustomAttr in the job ad
    +CustomAttr = "value"
    MY.CustomAttr = "value"
- You can refer to custom attributes in $() expansion in your submit file
    transfer_input_files = $F(My.CustomAttr)

Add custom attributes to the job
    Executable = xcode.pl
    Args = -s 640x360
    Transfer_executable = true
    Should_transfer_files = true
    # +WantIOProxy = true
    MY.SourceDir = $Fqp(FILE)
    MY.SourceFile = $Fqnx(FILE)
    +OutFile = "$Fn(FILE).mp4"
    Batch_name = $Fdb(FILE)
    Queue FILE matching files Firefly/*.wmv

Use a script to query the .job.ad
    #!/usr/bin/env perl
    # xcode.pl

    # Pull filenames from job ad
    my $src = `condor_status -ads .job.ad -af SourceFile`;
    my $out = `condor_status -ads .job.ad -af OutFile`;
    # find condor_chirp (also need +WantIOProxy in job)
    my $lib = `condor_config_val libexec`;
    chomp $src; chomp $out; chomp $lib;

    # fetch the input file
    system("$lib/condor_chirp fetch '$src' '$src'");
    # do the conversion
    system("ffmpeg -i '$src' @ARGV '$out'");

Use python to query the .job.ad
    #!/usr/bin/env python
    # xcode.py
    import htcondor, classad, htchirp, sys, subprocess

    # load the job ad and get the source filename
    job = classad.parseOne(open('.job.ad').read())
    src = job['SourceFile']
    srcpath = job['SourceDir'] + src

    # fetch the input file
    htchirp.fetch(srcpath, src)

    # do the conversion and return the exit code
    # (filenames are quoted because they contain spaces;
    # subprocess.call returns the command's exit code directly)
    cmd = "ffmpeg -i '{0}' {2} '{1}'".format(
        src, job['OutFile'], job['Arguments'])
    sys.exit(subprocess.call(cmd, shell=True))
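
If the classad module is not available inside the job sandbox, the string attributes a script like this needs can be picked out of the job ad text with a few lines of plain Python. This is a deliberately simplified sketch: it assumes one "Name = value" pair per line and only understands plain double-quoted strings with no escapes, so it is no substitute for a real ClassAd parser. The function name and the sample ad text are made up for illustration.

```python
def read_string_attrs(ad_text):
    """Collect attributes whose value is a plain double-quoted string."""
    attrs = {}
    for line in ad_text.splitlines():
        name, sep, value = line.partition("=")
        name, value = name.strip(), value.strip()
        # Keep only values of the form "..."; skip numbers and expressions.
        if sep and len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            attrs[name] = value[1:-1]
    return attrs


# Hypothetical excerpt of a .job.ad file.
sample = (
    'ClusterId = 104\n'
    'SourceDir = "Firefly/"\n'
    'SourceFile = "S1E1 Serenity.wmv"\n'
    'OutFile = "S1E1 Serenity.mp4"\n'
)
job = read_string_attrs(sample)
print(job["SourceDir"] + job["SourceFile"])
```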

Can I use a script to create my submit file?

Submit using python
- Use the htcondor.Submit class
    - Uses the condor_submit keywords
    - Supports $() expansion like condor_submit
- Use queue_with_itemdata()
    - optional count overrides queue arguments
    - optional iterator of itemdata overrides queue arguments
    - Returns a SubmitResult, which contains a ClassAd, the clusterId, and the range of procIds

Other python submit methods
- Submit.queue_with_itemdata()
    - Use this one, ignore the others.
- Submit.queue()
    - OK, but ad_results can use a lot of memory
- Schedd.submit() and submitMany()
    - Takes raw job ClassAds
    - Alters Requirements, Iwd
    - Don't use this method!

Example: Submit from python
    # HTCondor 8.8 or later for this code
    import os
    import htcondor

    schedd = htcondor.Schedd()
    sub = htcondor.Submit(open('xcode.sub').read())

    # override some things we read from xcode.sub
    srcdir = '/media/Firefly/'
    sub['MY.SourceDir'] = '"%s"' % srcdir
    sub['FILE'] = '$(Item)'
    files = os.listdir(srcdir)

    # submit using the files list
    with schedd.transaction() as txn:
        res = sub.queue_with_itemdata(txn, 1, iter(files))

    # result object has ClusterId, a "common" job classad
    # and the range of ProcId's
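
Note that os.listdir() returns every entry in the directory, not only the media files, and in no particular order. A small filter, sketched here as a plain Python helper (the name wmv_items is hypothetical and not part of the htcondor module), would keep only the .wmv names and make the job order predictable:

```python
def wmv_items(names):
    """Return only the .wmv names, sorted so the job order is stable."""
    return sorted(n for n in names if n.lower().endswith(".wmv"))


# Hypothetical directory listing.
listing = ["S1E2 The Train Job.wmv", "notes.txt", "S1E1 Serenity.WMV"]
print(wmv_items(listing))
```

Under that assumption, the submit example could build its list with files = wmv_items(os.listdir(srcdir)) before passing iter(files) as the itemdata.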

See how it's going...
    condor_q
    OWNER  BATCH_NAME  DONE  RUN  IDLE  TOTAL  JOB_IDS
    Tj     Firefly        _    2     2      _  104.0-4

    condor_q -af:jh JobStatus SourceFile SourceDir
    ID     JobStatus  SourceFile         SourceDir
    104.0  2          S1E1 Serenity      Firefly/
    104.1  2          S1E2 The Train Job Firefly/
    104.2  1          S1E3 Bushwhacked   Firefly/
    104.4  1          S1E4 Shindig       Firefly/

Put Queue on command line
    condor_submit xcod.sub -q FILE matching *.wmv
(xcod.sub must NOT have a queue line)
    pick.py | condor_submit x.sub -q FILE from -

Test using a subset of jobs
    Queue FILE from (
        S1E1 Serenity.wmv
        # S1E2 The Train Job.wmv
        # S1E3 Bushwhacked.wmv
    )
Use a python-style slice to define a subset
    Queue FILE matching files [:1] *.wmv

Even easier if you prepare
Put $(slice) in your submit file
    Queue FILE matching files $(slice) *.wmv
Then control the slice from the command line
    condor_submit 'slice=[:1]' firefly.sub
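
Since the slice is described as python-style, its effect can be previewed in Python before submitting. The sketch below parses a string such as [:1] into a Python slice object and applies it to a list of names. It only demonstrates Python's slice semantics; whether HTCondor accepts every form handled here is not something the slides state, and the helper name parse_slice is made up.

```python
def parse_slice(text):
    """Turn a string such as '[:1]' or '[1:4:2]' into a slice object."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("expected a slice like [:1], got %r" % text)
    parts = inner[1:-1].split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError("expected start:stop or start:stop:step")
    # An empty field means "no bound", exactly as in Python itself.
    return slice(*[int(p) if p.strip() else None for p in parts])


files = ["S1E1 Serenity.wmv", "S1E2 The Train Job.wmv",
         "S1E3 Bushwhacked.wmv", "S1E4 Shindig.wmv"]
print(files[parse_slice("[:1]")])
```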

Variable Tricks
    # transfer files starting with file001
    sequence = $(ProcId)+1
    transfer_input_files = $INT(sequence,file%03d)

    # Use the submit dir and cluster as the batch name
    batch_name = $Ffdb(SUBMIT_FILE)_$(ClusterId)

    # use the same random value for all jobs in this submit
    include command : /bin/echo rval=$RANDOM_INTEGER(1,100)
    Arguments = $(rval)
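
The file%03d in the first trick is a printf-style format, so the names it produces can be checked with the same format in Python. This is a small sketch of the arithmetic only (the helper name is hypothetical); it does not call HTCondor.

```python
def input_file_for(proc_id, fmt="file%03d"):
    """Mirror sequence = ProcId + 1 followed by printf-style formatting."""
    sequence = proc_id + 1
    return fmt % sequence


# ProcId starts at 0, so the first job gets file001.
for proc_id in range(3):
    print(proc_id, input_file_for(proc_id))
```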

Questions?