High Throughput Computing using Condor

# Running Fortran Applications under Condor

All of the files for this example can be found on condor.liv.ac.uk in /opt1/condor/examples/fortran.

## Contents

Introduction
Creating the Fortran 90 files
Compilation and initialisation
Creating the simplified job submission file
Running the Condor jobs
Combining the results
Discussion

## Introduction

Condor is well suited to running large numbers of Fortran jobs concurrently. If the application applies the same kind of analysis to large data sets (so-called "embarrassingly parallel" applications) or carries out similar calculations based on different random initial data (e.g. applications based on Monte Carlo methods), Condor can significantly reduce the time needed to generate the results by processing the data in parallel on different hosts. In some cases, simulations and analyses that would have taken years on a single PC can be completed in a matter of days.

The application will need to perform three main steps:

1. Create the initial input data and store it into files which can be processed in parallel.
2. Process the input data in parallel using the Condor pool and write the outputs to corresponding files.
3. Combine the data in the output files to generate the overall results.

As a very simple example we are going to use Monte Carlo analysis to estimate the value of pi. This is a slightly artificial example however, Monte Carlo methods are used in many practical applications and this example has the advantage of being easy to understand and fairly straightforward to program. You can find many articles on this example on the internet (two useful ones are: https://www.geeksforgeeks.org/estimating-value-pi-using-monte-carlo/ and https://academo.org/demos/estimating-pi-monte-carlo/ ) but a very brief description is given below.

Imagine a circle of radius r inscribed in a square of sides 2r and further imagine we choose points inside the square at random (a bit like throwing darts extremely badly at a dartboard). If a large number of points (N) are used then we would expect the ratio of those falling inside the circle (Nc) to those inside the square (Ns=N) to be in the same proportion as the area of the circle to the area of the square. For large values of N we can therefore calculate an approximate value of pi as 4(Nc/Ns), independently of r.

In the Fortran 90 program used below we will centre the circle on the origin and set r=1.0 hence the square has sides of length 2r=2.0 and we need to generate points with coordinates (x,y) from uniform random distributions in the range -1.0 ≤ x ≤ 1.0 and -1.0 ≤ y ≤ 1.0.
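The whole estimation procedure can be sketched serially in a few lines of Python (the document later suggests Python as a convenient alternative to Fortran; the function name `estimate_pi` is ours, for illustration only):

```python
import random

def estimate_pi(n):
    """Estimate pi by sampling n points uniformly in the square
    [-1, 1] x [-1, 1] and counting those inside the unit circle."""
    incircle = 0
    for _ in range(n):
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            incircle += 1
    return 4.0 * incircle / n

random.seed(42)          # fixed seed so the run is reproducible
print(estimate_pi(100000))
```

The Condor version below simply splits the `n` samples across many independent jobs and sums the per-job `incircle` counts afterwards.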

Instead of calculating the points serially in turn, the points will be calculated in batches using a number of Condor jobs so that the calculations are performed concurrently (i.e. at the same time). In more realistic examples, this is what will speed up the overall computation. Clearly the points will need to be chosen in a statistically independent manner for the Monte Carlo technique to work properly so we use a different random number seed for each job, these having been written to separate input files beforehand.

## Creating the Fortran files

The first step is to create a program to generate the N input seed files with names of the form seed0, seed1, ... seed<N-1>. A suitable Fortran 90 program (initialise.f90), which generates 1000 input files, is given below. It is rather more complicated than it needs to be because of Fortran's crude string handling capabilities, and it may be easier to use a more powerful language such as Python.
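For comparison, the equivalent seed-file generation takes only a few lines of Python (this sketch writes into a temporary directory so it is self-contained; in practice the files would go into the job directory):

```python
import os
import tempfile

# Write files seed0 .. seed999, each containing its own index,
# to be used as the random-number seed for one Condor job.
outdir = tempfile.mkdtemp()
for i in range(1000):
    with open(os.path.join(outdir, f"seed{i}"), "w") as f:
        f.write(f"{i}\n")
```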

```
      program initialise

      implicit none

      integer i

      character(len=10) :: filename
      character(len=10) :: format_string

      do i = 0, 999
         if (i < 10) then
            format_string = "(A4,I1)"
         else if (i < 100) then
            format_string = "(A4,I2)"
         else
            format_string = "(A4,I3)"
         endif

         write (filename, format_string) "seed", i

         open(666, file=trim(filename), status='unknown')
         write(666, '(i4)') i
         close(666)
      enddo

      end
```

The next Fortran 90 program is the one that will actually be run on the Condor pool to calculate how many points fall within the circle. An example implementation (pi.f90) is given below:

```
      program pi

      implicit none

      double precision x, y
      integer :: incircle, samplesize
      integer, allocatable :: seed(:)
      integer n, i, seed_value

      parameter(samplesize=1000)

!     read the seed value for this job from the input file
      open(1, file='seed', status='old')
      read(1, *) seed_value
      close(1)

      call random_seed(size = n)
      allocate(seed(n))
      seed = seed_value
      call random_seed(put=seed)

      incircle = 0
      do i = 1, samplesize
         call random_number(x)
         call random_number(y)
         x = x*2.0d0 - 1.0d0   ! map to -1.0 <= x <= 1.0
         y = y*2.0d0 - 1.0d0   ! map to -1.0 <= y <= 1.0
         if (x*x + y*y .lt. 1.0d0) then
            incircle = incircle + 1   ! point is inside the circle
         endif
      end do

      open(2, file='incircle', status='unknown')
      write(2, *) incircle
      close(2)

      end
```
This reads a seed value from the input file and initialises the random number generator with it. It then generates 1000 points and determines whether each lies inside the unit circle; if so, a counter is incremented, and finally the total number of points inside the circle is written to an output file. Note that the random_number() intrinsic subroutine returns random numbers in the range 0.0 ≤ x < 1.0.

The final step is to sum all of the points falling inside the circle to calculate pi by reading the partial sums from the results files (incircle*). This can be achieved using a Fortran 90 code such as this (combine.f90):

```
      program combine

      implicit none

      double precision pi
      integer i, incircle
      integer total_incircle

      character(len=20) :: filename
      character(len=10) :: format_string

      total_incircle = 0
      do i = 0, 999
         if (i < 10) then
            format_string = "(A8,I1)"
         else if (i < 100) then
            format_string = "(A8,I2)"
         else
            format_string = "(A8,I3)"
         endif

         write (filename, format_string) "incircle", i

         open(666, file=trim(filename), status='old')
         read(666, *) incircle
         close(666)
         total_incircle = total_incircle + incircle
      enddo

      pi = 4.0d0 * DBLE(total_incircle) / 1000000.0d0
      print '(A,F8.6)', 'Monte-Carlo estimate of pi: ', pi

      end
```
This reads the partial sum stored in each file in turn and sums them to calculate the approximate value of pi. Again this is unnecessarily complicated because of Fortran's basic string manipulation capabilities; using a higher-level language such as Python could make life easier.

## Compilation and initialisation

The GNU gfortran compiler can be used to create the executables that are to be run on the Condor server, i.e.:

```
$ gfortran initialise.f90 -o initialise
$ gfortran combine.f90 -o combine
```

It is also useful to create a UNIX executable from the code to be run on the Condor pool, for testing purposes:

```
$ gfortran pi.f90 -o pi
```

The GNU Fortran compiler can also generate executables to run under Windows for use on the Condor pool, so it is not necessary to compile the code on a PC. To compile the pi.f90 code into a Windows executable (.exe) on the Condor server itself, use:

```
$ gfortran-win pi.f90 -o pi.exe
```

The 1000 seed files can be created using the previously built initialisation executable:

```
$ ./initialise
```

The Condor code can then be tested on the server with a one-off seed file, e.g.:

```
$ cp seed123 seed
$ ./pi
$ cat incircle
```
It can be extremely difficult to track down errors when jobs are run on the Condor pool and so testing the application first on the server should be considered essential.

## Creating the simplified job submission file

Each Condor job needs a submission file to describe how the job should be run. These can appear rather arcane to new users, so to help simplify the process, tools have been created which work with more user-friendly job submission files. These are automatically translated into files which Condor understands and which users need not worry about. For this example, a submission file called run_pi such as the one below can be used:

```
executable = pi.exe
indexed_input_files = seed
indexed_output_files = incircle
log = log.txt
total_jobs = 1000
```

The Windows executable is specified in the executable line. The lines starting indexed_input_files and indexed_output_files specify the input and output files which differ for each individual job. The total number of jobs is given in the total_jobs line. The underlying job submission processes will ensure that each individual job receives the correct input file (seed0 ... seed999) and that the output files are indexed in a manner corresponding to the input files (e.g. output file incircle1 will correspond to seed1).

It is also possible to provide a list of non-indexed files (i.e. files that are common to all jobs), for example:

```
input_files = common1.dat,common2.dat,common3.dat
```
This is useful if the common (non-indexed) files are relatively large.

Aside:

For testing, the indexed_output_files line can be omitted so that all of the output files are returned (the default). It is useful to also capture the standard output and error (which would normally be printed to the screen) using the indexed_stdout and indexed_stderr attributes respectively. For production runs, the output files should always be specified just in case there is a run-time problem and they are not created. In this case, Condor will place the job in the held ('H') state. To release these jobs and run them elsewhere use:

`$ condor_release -all`

To find out why jobs have been held use:

`$ condor_q -held`

The job submission file can be edited using the nedit or nano editors; however, all of the options in it can be overridden temporarily from the command line (the file itself is not changed). For example, to submit five jobs instead of 1000, use:

```
$ fortran_submit run_pi -total_jobs=5
```
and to also change the executable to be used:
```
$ fortran_submit run_pi -total_jobs=5 -executable=other.exe
```
(use the --help option to see all of the options available)

This is useful for making small changes without the need to use the UNIX system editors to change the job submission file.

## Running the Condor Jobs

The Condor jobs are submitted from the Condor server using the command:

```
$ fortran_submit run_pi
```

It should return with something like:

```
$ fortran_submit run_pi
Submitting job(s).....................................................
......................................................................
......................................................................
1000 job(s) submitted to cluster 1394.
```
You can monitor the progress of all of your jobs using:

```
$ condor_q
```
Initially the Condor jobs will remain in the idle state until machines become available, e.g.:

```
$ condor_q

-- Schedd: Q1@condor1 : <10.102.32.11:45062?... @ 08/06/19 10:25:35
OWNER   BATCH_NAME         SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
smithic CMD: run_pi.bat   8/6  10:23    524     97    379   1000 1394.4-999

476 jobs; 0 completed, 0 removed, 379 idle, 97 running, 0 held, 0 suspended
```
The overall state of the pool can be seen using the command:
`$ condor_status`

## Combining the results

Once the jobs have completed (as indicated by the lack of any jobs in the queue), the directory should contain 1000 output files named incircle0, ... incircle999 which can be processed using the combine executable to generate the final result. e.g.:

```
$ ./combine
Monte-Carlo estimate of pi: 3.156000
```

## Discussion

Despite generating one million points, our estimate of pi is only accurate to a couple of significant figures, so this is clearly not a very efficient method. If we examine the partial sum files, the reason becomes apparent:

```
$ cat incircle* | more
789
789
789
789
789
789
789
...
```
All of the partial sums are exactly the same! This means we have essentially solved the same problem 1000 times and obtained the same solution, so the maximum number of significant digits we could expect is three (since there are only 1000 points in each job). If we instead submitted one million jobs, each generating just a single point, we would expect around 79% of the jobs (a fraction pi/4 ≈ 0.785) to have a point inside the circle. Another way of looking at this is to say that the problem has only fine-grained parallelism and is not suitable for Condor in this form.
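The underlying hazard is not specific to Fortran: a pseudo-random generator is completely determined by its seed, so if every job ends up initialising its generator the same way, every job computes the identical "random" sample. This can be illustrated in any language; a minimal Python sketch (the helper `stream` is ours):

```python
import random

def stream(seed, n=5):
    """Return the first n values of the pseudo-random stream for a seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Re-using a seed reproduces the stream exactly -- jobs initialised
# identically all generate the same points.
assert stream(7) == stream(7)
# Distinct seeds give distinct streams here, but distinctness alone does
# not guarantee the streams are statistically independent.
assert stream(7) != stream(8)
```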

We can do better than this by generating the random points beforehand and storing them into files. This program (randpoint.f90) does just that:

```
      program rand_points

      implicit none

      double precision x, y
      integer i, j

      character(len=10) :: filename
      character(len=10) :: format_string

      do i = 0, 999
         if (i < 10) then
            format_string = "(A6,I1)"
         else if (i < 100) then
            format_string = "(A6,I2)"
         else
            format_string = "(A6,I3)"
         endif

         write (filename, format_string) "points", i
         open(666, file=trim(filename), status='unknown')

         do j = 1, 1000
            call random_number(x)
            call random_number(y)
            write(666, *) x, y
         enddo

         close(666)
      enddo

      end
```
The Condor jobs now read 1000 points from each file and calculate the number of points lying inside the circle, using for example this code (newpi.f90):
```
      program pi

      implicit none

      double precision x, y
      integer :: incircle, samplesize
      integer i

      parameter(samplesize=1000)

      open(1, file='points', status='old')

      incircle = 0
      do i = 1, samplesize
         read(1, *) x, y              ! read a pre-generated point
         x = x*2.0d0 - 1.0d0          ! map to -1.0 <= x <= 1.0
         y = y*2.0d0 - 1.0d0          ! map to -1.0 <= y <= 1.0
         if (x*x + y*y .lt. 1.0d0) then
            incircle = incircle + 1   ! point is inside the circle
         endif
      end do

      close(1)

      open(2, file='incircle', status='unknown')
      write(2, *) incircle
      close(2)

      end
```
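Submitting these jobs should need only minor changes to the simplified submission file: the indexed input files are now the points files rather than the seed files. A sketch, assuming the Windows executable is built as newpi.exe (the key names follow the run_pi example above):

```
executable = newpi.exe
indexed_input_files = points
indexed_output_files = incircle
log = log.txt
total_jobs = 1000
```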
The partial sums can be combined as before:

```
$ ./combine
Monte-Carlo estimate of pi: 3.140900
```
This is an improvement on the previous attempt and gives the same result as a serial version (the code for this is in serial.f90).

Although this is something of a toy example, it does illustrate an important point: when dealing with Monte Carlo methods it is very important to pay attention to the statistical properties of the random number generator(s). The first method used here is widely cited as an example on the internet (see for example this MPI version from Oak Ridge National Lab); however, it is extremely inefficient and possibly fatally flawed.


last updated on Monday, 14-Oct-2019 10:55:38 BST by I.C. Smith

ARC, University of Liverpool, Liverpool L69 3BX, United Kingdom, +44 (0)151 794 2000