High Throughput Computing using Condor

Condor Case Studies

The following examples illustrate just a few projects which have benefited from using the ARC Condor High Throughput Computing Service.

Simulation of a Randomised Clinical Trial into Lowering Cholesterol

I used Condor to simulate a randomised clinical trial of the cholesterol-lowering drug simvastatin. Using my own MATLAB code, I generated virtual patients whose cholesterol was then simulated twenty times under a standard dosing protocol and twenty times under a new dosing protocol. With ten patients, the number of jobs submitted to Condor totalled around 400. The standard dosing protocol simulations took around 15 minutes and the new dosing protocol simulations took around an hour and a half.

Ben Francis, Institute of Translational Medicine
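
As a rough illustration of where the figure of around 400 jobs comes from, the Python sketch below enumerates one job per patient, protocol and repeat. It assumes one simulation per job. The study itself used MATLAB, and the names here (`build_job_list`, the protocol labels) are hypothetical and not taken from the original code.

```python
from itertools import product

# Figures taken from the case study text.
N_PATIENTS = 10
REPEATS_PER_PROTOCOL = 20
# Hypothetical labels for the two dosing protocols.
PROTOCOLS = ("standard", "new")


def build_job_list():
    """Return one (patient, protocol, repeat) tuple per Condor job.

    Assumption: each job runs exactly one simulation.
    """
    return list(product(range(N_PATIENTS), PROTOCOLS, range(REPEATS_PER_PROTOCOL)))


jobs = build_job_list()
print(len(jobs))  # 10 patients x 2 protocols x 20 repeats = 400
```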

Mathematical Modelling of Disease Transmission in Animal Herds

I have used Condor to model the transmission of E. coli O157 between individual animals in a herd and the transmission of bluetongue virus between individual farms in Suffolk and Norfolk. I write my own MATLAB code and use Condor to run multiple simulations.

E. coli O157 [1]: Here I needed to run lots of simulations for each parameter set/scenario, so I ran 1500 jobs with 20 simulations per job.

Bluetongue virus [2]: In this case, a single simulation could take up to an hour to run and used a lot of memory in MATLAB, so I ran 100 jobs with 1 simulation per job.


[1] Turner J, Bowers RG, Clancy D, Behnke MC, Christley RM (2008) A network model of E. coli O157 transmission within a typical UK dairy herd: The effect of heterogeneity and clustering on the prevalence of infection. Journal of Theoretical Biology 254, 45–54 (doi:10.1016/j.jtbi.2008.05.007).

[2] Turner J, Bowers RG, Baylis M (2012) Modelling bluetongue virus transmission between farms using animal and vector movements. Scientific Reports 2:319 (doi:10.1038/srep00319).

Dr Joanne Turner, Department of Epidemiology and Public Health
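
The two batching choices described above (many short simulations grouped 20 to a job, versus one long simulation per job) can be sketched as a mapping from a job index to the simulation indices that job runs. This is an illustrative Python sketch only. The original work used MATLAB, and `simulations_for_job` is a hypothetical helper, not part of the author's code.

```python
def simulations_for_job(job_index, sims_per_job):
    """Return the global simulation indices handled by one job.

    Job 0 runs simulations 0..sims_per_job-1, job 1 runs the next block, and so on.
    """
    start = job_index * sims_per_job
    return range(start, start + sims_per_job)


# Figures taken from the case study text.
ecoli_total = 1500 * 20      # 1500 jobs, 20 short simulations each
bluetongue_total = 100 * 1   # 100 jobs, 1 long simulation each

print(ecoli_total, bluetongue_total)
print(list(simulations_for_job(2, 20))[:3])
```

Grouping short simulations keeps the per-job overhead small relative to the work done, while long, memory-hungry simulations are simpler to run one per job.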

Simulation of Radiotherapy Treatment using Monte Carlo Methods

The EGSnrc-BEAMnrc-DOSXYZnrc Monte Carlo analysis software (a suite of open-source code developed by the National Research Council, Canada and written using the MORTRAN extension of FORTRAN) is used for radiotherapy treatment planning with DICOM computed tomography (CT) datasets. A typical simulation, which contains billions of particle histories, is split into 5000 parallel jobs. The code is currently used for a PhD project but will be extended to clinical use as a Quality Assurance (QA) tool in the near future (in collaboration with Clatterbridge Cancer Centre, Wirral).

Mekala Chandrasekaran, Clatterbridge Centre for Oncology
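
Splitting one large Monte Carlo run into 5000 jobs amounts to dividing the particle histories as evenly as possible between the jobs. The Python sketch below shows one way to do that division. It does not use or reproduce any EGSnrc interface, `split_histories` is a hypothetical helper, and the total of two billion histories is an assumed example value, since the text only says "billions". In practice each job would also need an independent random number stream, which is not shown here.

```python
def split_histories(total_histories, n_jobs):
    """Divide total_histories between n_jobs so that counts differ by at most one."""
    base, extra = divmod(total_histories, n_jobs)
    # The first `extra` jobs take one additional history each.
    return [base + 1 if i < extra else base for i in range(n_jobs)]


N_JOBS = 5000                      # from the case study text
TOTAL_HISTORIES = 2_000_000_000    # assumed example value, not from the text

counts = split_histories(TOTAL_HISTORIES, N_JOBS)
print(counts[0], sum(counts))
```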

Classification of Complex Signals/Images using Genetic Programming

We use Condor to evolve a set of potential cost functions that are used as projection indices for feature extraction in classification problems. The evolution process was implemented via genetic programming (GP), using cross-validation as the fitness function. At the i-th iteration, also known as the i-th generation, the GP creates a new population by mixing the chromosomes of the existing population via genetic operators such as crossover or mutation. The fitness value of each offspring is then evaluated and the best-performing ones are allowed to survive into the next generation. We use Condor to compute the fitness of each offspring, with each evaluation executed as a single job. A typical population consists of 100 individuals/jobs which are evolved for 20 generations. This system was implemented using MATLAB with checkpointing enabled.

Keywords: machine learning, classification, feature extraction, projection pursuit, model selection, genetic programming.

Eduardo Rodriguez Martinez, Department of Electrical Engineering and Electronics
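
The generation-by-generation structure described above can be sketched in Python as follows. This is a simplified evolutionary loop over fixed-length numeric chromosomes, not the tree-based genetic programming or the cross-validation fitness used in the project, which was written in MATLAB. The toy `fitness` function, `GENOME_LEN`, the mutation rate and the rule of keeping the top half as parents are all assumptions made for illustration. Each fitness call stands in for one Condor job.

```python
import random

POP_SIZE = 100     # individuals per generation (from the text)
GENERATIONS = 20   # number of generations (from the text)
GENOME_LEN = 8     # assumed toy chromosome length


def fitness(individual):
    """Toy stand-in for the cross-validation score; one call represents one job."""
    return -sum((gene - 0.5) ** 2 for gene in individual)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two chromosomes."""
    cut = rng.randrange(1, GENOME_LEN)
    return parent_a[:cut] + parent_b[cut:]


def mutate(individual, rng, rate=0.1):
    """Replace each gene with a fresh random value with probability `rate`."""
    return [rng.random() if rng.random() < rate else gene for gene in individual]


def evolve(seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
    jobs_run = 0
    best_per_generation = []
    for _ in range(GENERATIONS):
        # On Condor, these evaluations would be submitted as independent jobs.
        scores = [fitness(ind) for ind in population]
        jobs_run += len(population)
        best_per_generation.append(max(scores))
        ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
        parents = [ind for _, ind in ranked[: POP_SIZE // 2]]
        population = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(POP_SIZE)
        ]
    return jobs_run, best_per_generation


jobs_run, best_per_generation = evolve()
print(jobs_run)  # 100 individuals x 20 generations = 2000 fitness jobs
```

Because the fitness evaluations within a generation are independent of one another, each generation forms a natural batch of jobs, while successive generations must run in sequence.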

Simulation of Disease Transmission in the Poultry and Aquaculture Industries

Initial work concentrated on analysing the effect of an incursion of H5N1 avian influenza into UK poultry flocks [1]. This was performed by running large numbers (on the order of thousands) of simulations written using our own MATLAB code. Condor provided an ideal platform for this to be achieved in a reasonable time frame. Subsequent work has shifted to a similar analysis of the aquaculture industry in England and Wales [2].


[1] Kieran J. Sharkey, Roger G. Bowers, Kenton L. Morgan, Susan E. Robinson and Robert M. Christley, Epidemiological consequences of an incursion of highly pathogenic H5N1 avian influenza into the British poultry flock, Proc. R. Soc. B (2008) 275, 19-28 (doi:10.1098/rspb.2007.1100).

[2] A. R. T. Jonkers, K. J. Sharkey, M. A. Thrush, J. F. Turnbull and K. L. Morgan, Epidemics and control strategies for diseases of farmed salmonids: A parameter study, Epidemics, v. 2, issue 4 (December 2010), pp 195-206 (doi:10.1016/j.epidem.2010.08.001).

Dr Kieran Sharkey, Department of Mathematical Sciences