
High Throughput Computing using Condor

Condor Job Analysis

Although the Condor recent usage statistics give a good overview of Condor activity, they do not show in detail how Condor jobs are performing. A significant amount of CPU time may be wasted by job evictions without this being evident in the bulk statistics. This page and the pages linked from it aim to give users a more in-depth analysis by breaking CPU usage down into two parts: goodput and badput. Goodput is the CPU time that is actually put to good use by Condor and contributes to the solution of the problem at hand. Badput is CPU time wasted by job evictions; it can be significant for long-running jobs that do not use checkpointing.

By minimising badput, it is possible to turn jobs around more quickly, leading to higher throughput and ultimately to results being produced sooner. It also means that less electricity, and hence money, is wasted by the Condor pool PCs, so that "everyone's a winner". If you find that your jobs are clocking up too much badput, it may be a good idea to split them into shorter jobs or to use checkpointing. If you are unsure how to proceed, please contact the Condor administrator, Ian C. Smith (email: i.c.smith@liverpool.ac.uk), for advice.

The table below gives cumulative figures for jobs submitted in approximately the last month. You can get a breakdown of these figures by following the link corresponding to your username. From there you can drill down to a more detailed analysis by following the job links. For jobs that do not use checkpointing, all evictions are assumed to contribute to badput (in other words, all of the CPU time used from the current job start time up to the eviction is counted as wasted). For checkpointing jobs, evictions are assumed not to cause any badput (although there may still be other causes). This may sometimes lead to badput being under-estimated.
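The accounting rule described above can be sketched in a few lines of Python. This is only an illustration of the stated assumptions, not the method used by the Condor Log Analyzer: the `Run` record and the `split_cpu_time` function are hypothetical names, and the sketch models evictions only, ignoring any other causes of badput.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One execution attempt of a job (hypothetical record, not a Condor log format)."""
    cpu_hours: float  # CPU time used between this start and the end of the attempt
    evicted: bool     # True if the attempt ended in an eviction


def split_cpu_time(runs, checkpointing):
    """Return (goodput, badput) in hours under the assumptions stated on this page."""
    goodput = 0.0
    badput = 0.0
    for run in runs:
        if run.evicted and not checkpointing:
            # No checkpoint: everything since the job start is assumed lost.
            badput += run.cpu_hours
        else:
            # Completed attempts, and evictions of checkpointing jobs,
            # are assumed to contribute useful work.
            goodput += run.cpu_hours
    return goodput, badput


# A job evicted twice (after 3 and 2 hours) before completing in 5 hours.
runs = [Run(3.0, True), Run(2.0, True), Run(5.0, False)]
print(split_cpu_time(runs, checkpointing=False))
print(split_cpu_time(runs, checkpointing=True))
```

In this example the job without checkpointing loses the 5 hours spent in the two evicted attempts, whereas the same history with checkpointing is counted entirely as goodput.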

The statistics were produced using the Condor Log Analyzer at the University of Notre Dame. This is free to use and open to all if you wish to analyse your own log files. We cannot guarantee that it will always be 100% accurate or will always work, since its internal workings are known only to its authors.

Condor use for approximately the last month

Username  CPU Time  Goodput   Badput  Submitted   Evicted  Completed
dmhughes   3439509  1592102  1847327     844086   3572818     710544
mesudell   2267066  1035788  1231198     676000   5283085     604460
dlythgoe    359979   102029   257916     196164    973635     110149
langfeld     44911    29424    15475      17658    412631      16022
riham       126033   112567    13442     116627     12714      45148
campagne      6781     2688     4093     218400    374527     218400
thmel        40376    37760     2616        100      3006         38
adinajwa      3927     2292     1620      19500     83197      16390
graeme        1057      128      928       3010    164083       2010
jo91al        3410     2505      903       1634      1774       1603
tonymcc      11240    10459      765       6251      6867       1788
smithic       8441     8206      231       9564      4184       6883
matts         3836     3796       39       4022       962       4008
vrtd48         561      521       37       6259      2500       4041
robertwi       715      715        0      20012       225      11010
TOTAL      6317842  2940980  3376590    2139287  10896208    1752494

All times in hours, click links for detailed analysis
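To judge whether your own jobs are clocking up too much badput, it can help to express badput as a share of CPU time. The following is a minimal sketch using three rows read from the table above. The `badput_share` helper is a hypothetical name, and dividing by the reported CPU time (rather than by goodput plus badput, which differs slightly in the table) is an assumption made here.

```python
# CPU time, goodput and badput in hours, copied from three rows of the table.
usage = {
    "dmhughes": (3439509, 1592102, 1847327),
    "thmel": (40376, 37760, 2616),
    "TOTAL": (6317842, 2940980, 3376590),
}


def badput_share(cpu_time, badput):
    """Fraction of CPU time lost to badput (0.0 when no CPU time was used)."""
    return badput / cpu_time if cpu_time else 0.0


for user, (cpu_time, goodput, badput) in usage.items():
    print(f"{user:<10}{badput_share(cpu_time, badput):6.1%}")
```

On these figures, a little over half of the pool-wide CPU time is badput, while a user such as thmel loses well under a tenth.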

Last updated: Tue Jan 30 00:54:37 GMT 2018