Simplified Job Submission
A number of tools have been developed locally with the aim of making HTCondor job submission more user-friendly and are described here. One difficulty users face with HTCondor is creating HTCondor submit description files (whose syntax can be fairly obscure) and editing these files using the UNIX system editors (which are not exactly known for their ease of use). These tools do not do away with the need for submit description files altogether but strive to make them easier to create and use.
General Purpose Tools
The mws_submit command can be used instead of condor_submit to submit jobs using simplified submit description files i.e.:
$ mws_submit simplified_description_file
(Where simplified_description_file is the name of the "user-friendly" submit description file).
A typical submit description file might contain:
executable = myapp.exe
input_files = myinput_common, other_input_common
indexed_input_files = input_data, other_input_data
indexed_output_files = output_data
total_jobs = 10
All of the attributes are optional apart from executable which must be specified (the default for total_jobs is a single job). The executable attribute specifies the main executable file to be run on a HTCondor pool PC - this will generally be a .bat file or a .exe file. The input_files attribute lists which input files are common to all jobs whilst indexed_input_files lists input files which are different for each individual job. In this example, each job will get its own input_data file from the set of input files input_data0 ... input_data9 (the same is true for other_input_data).
The indexed_output_files attribute will ensure that the output files are retrieved following the same indexing as the input files (i.e. output_data0 ... output_data9). The internal indexing/unindexing is taken care of for you so there is no need to manipulate the index value inside your executable - just use the generic name (e.g. input_data).
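The naming scheme for indexed files can be illustrated with a short Python sketch. It is not part of the local tools; the function name indexed_names is hypothetical, and the rule that the index is placed just before any file extension (or appended when there is none) is an assumption inferred from the examples in this guide.

```python
import os


def indexed_names(generic_name, total_jobs):
    """Return the per-job file names for one generic (indexed) file name.

    Assumption: the job index goes just before the file extension, or at
    the end of the name when there is no extension.
    """
    stem, ext = os.path.splitext(generic_name)
    return [f"{stem}{i}{ext}" for i in range(total_jobs)]


if __name__ == "__main__":
    # Ten jobs reading input_data0 ... input_data9
    print(indexed_names("input_data", 10))
    # A name with an extension: input0.mat, input1.mat, input2.mat
    print(indexed_names("input.mat", 3))
```

A helper like this can be handy for generating or checking the set of input files before submitting.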
All of the values given in the job description may be temporarily overridden from the command line (although the job description file is left unchanged). For example to change the number of submitted jobs from ten to five:
$ mws_submit simplified_description_file --total_jobs=5
and to also use a different executable:
$ mws_submit simplified_description_file --total_jobs=5 --executable=otherapp.exe
This makes it easy to make small changes without the need to edit the job description file. To see all of the options just use the -h option with the simplified submit description tool e.g.
$ mws_submit -h
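The override behaviour described above can be pictured as a simple merge in which command-line values replace values read from the file. The Python sketch below is only a model of that idea, not the mws_submit implementation; the function names parse_description and apply_overrides are hypothetical, and the --name=value form is taken from the examples above.

```python
def parse_description(text):
    """Parse 'name = value' lines from a simplified description into a dict."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        attrs[name.strip()] = value.strip()
    return attrs


def apply_overrides(attrs, argv):
    """Return a copy of attrs with --name=value arguments applied on top."""
    merged = dict(attrs)  # copy, so the values read from the file stay unchanged
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            merged[name] = value
    return merged


description = """\
executable = myapp.exe
indexed_input_files = input_data, other_input_data
total_jobs = 10
"""

file_attrs = parse_description(description)
effective = apply_overrides(
    file_attrs, ["--total_jobs=5", "--executable=otherapp.exe"]
)
print(effective["total_jobs"], effective["executable"])  # 5 otherapp.exe
print(file_attrs["total_jobs"])  # 10
```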
The mws_submit command creates the job description file used by HTCondor which will have the same name as the simplified job description file but with a .sub extension. The HTCondor job description file corresponding to the above example is:
universe = vanilla
executable = myapp.exe
transfer_input_files = myinput_common, other_input_common, input_data$(PROCESS), other_input_data$(PROCESS)
transfer_output_files = output_data$(PROCESS)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
requirements = ( Arch=="X86_64") && ( OpSys=="WINDOWS" )
notification = never
queue 10
Clearly this is a good deal more complicated. Several other attributes can be specified in the simplified job description and all of these are detailed in a later section.
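To make the relationship between the two files concrete, the following Python sketch rebuilds the HTCondor submit file shown above from the simplified attributes. It is a rough illustration under stated assumptions, not the actual mws_submit code: the function names are hypothetical, only the attributes from this example are handled, and the fixed lines (universe, requirements, notification) are simply copied from the generated file above.

```python
import os


def with_process_macro(name):
    """Insert HTCondor's $(PROCESS) macro where the job index belongs."""
    stem, ext = os.path.splitext(name)
    return f"{stem}$(PROCESS){ext}"


def split_list(value):
    """Split a comma-separated attribute value, ignoring surrounding spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_submit_file(attrs):
    """Build HTCondor submit file text from simplified attributes (a dict)."""
    common = split_list(attrs.get("input_files", ""))
    indexed_in = [
        with_process_macro(n)
        for n in split_list(attrs.get("indexed_input_files", ""))
    ]
    indexed_out = [
        with_process_macro(n)
        for n in split_list(attrs.get("indexed_output_files", ""))
    ]
    lines = ["universe = vanilla", f"executable = {attrs['executable']}"]
    if common or indexed_in:
        lines.append("transfer_input_files = " + ", ".join(common + indexed_in))
    if indexed_out:
        lines.append("transfer_output_files = " + ", ".join(indexed_out))
    lines += [
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        'requirements = ( Arch=="X86_64") && ( OpSys=="WINDOWS" )',
        "notification = never",
        f"queue {attrs.get('total_jobs', 1)}",  # default is a single job
    ]
    return "\n".join(lines) + "\n"


example = {
    "executable": "myapp.exe",
    "input_files": "myinput_common, other_input_common",
    "indexed_input_files": "input_data, other_input_data",
    "indexed_output_files": "output_data",
    "total_jobs": "10",
}
submit_text = build_submit_file(example)
print(submit_text)
```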
Tools for Submitting MATLAB Jobs
Running MATLAB jobs on the HTCondor pool is made more difficult by the need to build standalone executables to be run as jobs. The tools described below are designed to assist in testing, building and running MATLAB applications.
Since the HTCondor server has MATLAB installed, it is possible to test out M-files on it. The command matlab_run can be used to pass the M-file to the MATLAB interpreter without the need for the graphical interface to be started (it is therefore suitable for use with PuTTY or other terminal emulators), for example:
$ matlab_run product.m
Here product.m would need to contain a MATLAB function called product and be able to run without input from the user. MATLAB on the server should be used sparingly and not for M-files which are likely to require significant CPU use over long periods as this can impact badly on the performance of HTCondor.
In the past it was not possible to run M-files directly on the HTCondor PCs; however, this can now be achieved by using a special job description file e.g.
M_file = product.m
indexed_input_files = input.mat
indexed_output_files = output.mat
total_jobs = 10
This can be submitted to HTCondor using the command m_file_submit e.g.
$ m_file_submit product
(Where product is the name of the job description file. Note that the total number of jobs is limited to 10 to avoid taking up too many licenses).
The command will return with the job ID of the M-file job and on completion the output files output*.mat will have been created.
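After the jobs have finished it can be useful to confirm that every expected output file has actually arrived. The Python sketch below is one possible way of doing this and is not one of the local tools; the function name missing_outputs is hypothetical, and it assumes the output0.mat, output1.mat, ... naming used in this guide. The demonstration at the end runs in a temporary directory so that it does not depend on any real job output.

```python
import os
import tempfile


def missing_outputs(generic_name, total_jobs, directory="."):
    """List the expected indexed output files that are not present."""
    stem, ext = os.path.splitext(generic_name)
    expected = [f"{stem}{i}{ext}" for i in range(total_jobs)]
    return [
        name
        for name in expected
        if not os.path.exists(os.path.join(directory, name))
    ]


# Demonstration: pretend jobs 0 and 2 finished but job 1 did not.
with tempfile.TemporaryDirectory() as workdir:
    for i in (0, 2):
        open(os.path.join(workdir, f"output{i}.mat"), "w").close()
    demo_missing = missing_outputs("output.mat", 3, workdir)

print(demo_missing)  # ['output1.mat']
```

In real use, missing_outputs("output.mat", 10) would be called from the directory the jobs were submitted from.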
Once the M-file is found to work properly it should be compiled into a standalone executable using the current MATLAB version on a Windows PC - see the MATLAB applications page for more details.
Note that if the M-file contains any syntax errors, the MATLAB compiler will not catch these and will blindly compile the code into an executable which will fail when run under HTCondor. It is extremely difficult to locate these errors later on so please always check that the M-file works correctly before compiling it.
MATLAB standalone applications can be submitted to the HTCondor pool using a simplified submit description file and the command matlab_submit e.g.
$ matlab_submit simplified_job_description_file
The job description file can make use of the same attributes as mws_submit, for example:
indexed_input_files = input.mat
indexed_output_files = output.mat
executable = product.exe
indexed_log = logfile
total_jobs = 10
The actual submit description file passed to HTCondor in this example is:
universe = vanilla
should_transfer_files = YES
when_to_transfer_output=ON_EXIT
executable = standalone.bat
arguments = product.exe $(PROCESS) input.mat output.mat
transfer_input_files = product.exe,input$(PROCESS).mat,/opt1/condor/apps/matlab/index.exe,/opt1/condor/apps/matlab/unindex.exe
transfer_output_files = output$(PROCESS).mat
log = logfile$(PROCESS)
request_cpus = 1
requirements = ( Arch=="X86_64") && ( OpSys=="WINDOWS" )
notification = never
queue 10
which again is a good deal more complicated than the simplified job description file.
Summary of Job Description Attributes
A complete list of attributes which can be used in simplified job description files for use with mws_submit and matlab_submit is given below. For attributes with multiple values, a comma-separated list is used. Inside a description file the list may contain spaces, but on the command line it must not. For example
$ mws_submit -indexed_input_files=input1,input2
will work but
$ mws_submit -indexed_input_files=input1, input2
will not. For any of the simplified submit description tools, use the -h option to get a complete list of options. Attributes can be specified in a submit description file and/or on the command line with the latter taking precedence.
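A likely reason for the restriction on spaces is ordinary shell word splitting: the shell breaks the command line at unquoted spaces before the tool ever sees it. The Python sketch below uses the standard shlex module to mimic POSIX shell splitting of the two command lines above. It only demonstrates the splitting and makes no claim about how mws_submit parses its arguments internally.

```python
import shlex

# The form that works: the whole list arrives as a single argument.
good = shlex.split("mws_submit -indexed_input_files=input1,input2")

# The form that fails: the space splits the list into two arguments,
# so input2 is no longer attached to the option.
bad = shlex.split("mws_submit -indexed_input_files=input1, input2")

print(good)  # ['mws_submit', '-indexed_input_files=input1,input2']
print(bad)   # ['mws_submit', '-indexed_input_files=input1,', 'input2']
```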
- indexed_input_files
- A comma-separated list of file names for input files unique to each job. If, for example, input.mat is given as an indexed input file, this would correspond to the set of files input0.mat .. input(n-1).mat with the ith job receiving inputi.mat as an input file.
- indexed_log
- Similar to the log attribute (below) but different log files are used for each job making it easier to track down information.
- indexed_output_files
- A comma-separated list of file names for output files unique to each job. If, for example, output.mat is given as an indexed output file, this would correspond to the set of files output0.mat .. output(n-1).mat with the ith job producing outputi.mat as an output file.
- indexed_stdout
- File to which each individual job's standard output is to be redirected. The file names will be indexed in a similar manner to the indexed input/output files so that the standard output of each individual job can be seen. This can sometimes be useful in determining where things have gone wrong.
- indexed_stderr
- File to which each individual job's standard error stream is to be redirected. The file names will be indexed in a similar manner to the indexed input/output files so that the standard error of each individual job can be seen. This can sometimes be useful in determining where things have gone wrong.
- input_files
- A comma-separated list of input files common to all jobs.
- log
- File to which HTCondor logs information about the progress of jobs. For multiple jobs, all of the log information is merged into one file and a better choice may be to use indexed_log. This can be useful in determining where and for how long jobs ran.
- runtime
- The maximum time (in minutes) that a job will be allowed to run for. After this time has elapsed, the job will be held then released causing it to go back into the HTCondor queue. This is useful to prevent jobs getting "stuck".
- memory
- This attribute can be used to ensure that jobs run only on machines with at least a given amount of memory. The memory size is specified in GB so that memory = 1 would ensure that jobs run only on PCs with at least 1 GB of memory in total (not per core).
- stdout
- File to which the job's standard output is to be redirected. This is only really useful for single jobs - for multiple parallel jobs use indexed_stdout.
- stderr
- File to which the job's standard error stream is to be redirected. This is only really useful for single jobs - for multiple parallel jobs use indexed_stderr.
Summary of Commands
- matlab_run M-file
- Runs an M-file using MATLAB on the HTCondor server without the need to start the graphical interface.
- matlab_submit simplified_submit_description_file
- Submits a standalone MATLAB executable to the pool. The executable does not need to manipulate the input and output filenames to give the correct indexes.
- m_file_submit simplified_submit_description_file
- Uses a HTCondor pool PC to run the specified M-file. A submit description file needs to be supplied which contains (at a minimum) the name of the M-file and the input file it reads.
- mws_submit simplified_submit_description_file
- Submits a generic HTCondor job to the pool using a simplified submit description file. Users' applications must ensure that all file indexing is taken care of.