High Throughput Computing using Condor

Condor Job Analysis

Although the recent Condor usage statistics provide a good overview of Condor activity, they do not provide a detailed analysis of how Condor jobs are performing. A significant amount of CPU time may be wasted by job evictions without this being evident in the bulk statistics. This page and the linked pages aim to give users a more in-depth analysis by breaking down CPU usage into two parts: goodput and badput. Goodput is the CPU time which is actually put to good use by Condor and contributes to the solution of the problem at hand. Badput is CPU time wasted by job evictions and can be significant for long-running jobs which do not use checkpointing.

By minimising badput, it is possible to turn around jobs more quickly, leading to higher throughput and ultimately to results being produced sooner. This also means that less electricity - and hence money - is wasted by the Condor pool PCs so that "everyone's a winner". If you find that your jobs are clocking up too much badput, then it may be a good idea to split them into shorter jobs or use checkpointing. If you are unsure of how to proceed, please contact the Condor administrator Ian C. Smith (email: i.c.smith@liverpool.ac.uk) for advice.

The table below gives cumulative figures for jobs submitted in approximately the last month. You can get a breakdown of these figures by following the link corresponding to your username. From there you can drill down to more detailed analysis by following the job links. For jobs that do not use checkpointing, all evictions are assumed to contribute to badput (in other words, all of the CPU time used from the current job start time up to the eviction is wasted). For checkpointing jobs, evictions are assumed not to cause any badput (however, there may still be other causes). This may sometimes lead to badput being under-estimated.
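
The accounting assumption described above can be illustrated with a short Python sketch. It is only an illustration of the stated rule, not the method used by the Condor Log Analyzer; the `Run` record and the `split_cpu_time` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One execution attempt of a job (hypothetical record layout)."""
    cpu_hours: float  # CPU time used from the start of this run to its end
    evicted: bool     # True if this run ended in an eviction


def split_cpu_time(runs, uses_checkpointing):
    """Return (goodput, badput) in hours under the assumption stated above.

    Without checkpointing, every evicted run counts entirely as badput.
    With checkpointing, evictions are assumed to cause no badput.
    """
    goodput = 0.0
    badput = 0.0
    for run in runs:
        if run.evicted and not uses_checkpointing:
            badput += run.cpu_hours
        else:
            goodput += run.cpu_hours
    return goodput, badput


# A job evicted twice before finally running to completion.
runs = [Run(3.0, True), Run(1.5, True), Run(6.0, False)]
print(split_cpu_time(runs, uses_checkpointing=False))  # (6.0, 4.5)
print(split_cpu_time(runs, uses_checkpointing=True))   # (10.5, 0.0)
```

In this made-up example the same job wastes 4.5 of its 10.5 CPU hours without checkpointing, and none under the checkpointing assumption, which is why the figures for checkpointing jobs may under-estimate badput.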

The statistics were produced using the Condor Log Analyzer at the University of Notre Dame. This is free to use and open to all if you wish to analyse your own log files. We cannot guarantee that it will always be 100% accurate or will always work, since its internal workings are known only to its authors.

Condor use for approximately the last month

Username  CPU Time  Goodput   Badput  Submitted  Evicted  Completed
mesudell   2147338   920756  1226554     478000  5252212     467509
dmhughes   1089940   507679   582238     171112   302183     142338
dlythgoe    345001    89833   255135     192164   967476     106358
langfeld     35584    20163    15412      13342   402376      12674
riham       123718   110262    13434     114100    11830      42871
campagne      6781     2688     4093     218400   374527     218400
thmel        40376    37760     2616        100     3006         38
adinajwa      3927     2292     1620      19500    83197      16390
graeme        1057      128      928       3010   164083       2010
jo91al        3410     2505      903       1634     1774       1603
smithic       8441     8206      231       9552     4180       6881
tonymcc       1657     1574       77       1217      757        697
matts          108      108        0       1018      103       1006
robertwi       715      715        0      20012      225      11010
TOTAL      3808053  1704669  2103241    1243161  7567929    1029785

All times are in hours. Click the links for detailed analysis.
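
To judge whether your own jobs are clocking up too much badput, it can help to express goodput as a percentage of total CPU time. The following Python sketch does this for a few rows of the table, using only the CPU Time and Goodput columns; the `goodput_percent` function and the `usage` dictionary are hypothetical names introduced for this example.

```python
# (CPU time in hours, goodput in hours) copied from the table above
usage = {
    "mesudell": (2147338, 920756),
    "dmhughes": (1089940, 507679),
    "thmel": (40376, 37760),
    "TOTAL": (3808053, 1704669),
}


def goodput_percent(cpu_hours, goodput_hours):
    """Return goodput as a percentage of total CPU time."""
    if cpu_hours == 0:
        return 0.0  # no CPU time used, so nothing to report
    return 100.0 * goodput_hours / cpu_hours


for user, (cpu, good) in usage.items():
    print(f"{user:10s} {goodput_percent(cpu, good):5.1f}% goodput")
```

For the pool as a whole this gives a goodput of roughly 45% of the CPU time used, so more than half of the CPU time in this period did not contribute to completed work.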

Last updated: Fri Nov 24 00:45:25 GMT 2017