This article is a follow-up to the previous article, Estimating the Cost and Schedule of Mathematical Software. In that article, the author advocated using software engineering expert Barry Boehm’s Basic COCOMO Embedded Mode cost model to estimate the cost and schedule of mathematical software projects, with the important qualification that actual effort often differs substantially from the effort the model estimates. By Boehm’s own account, Basic COCOMO estimates are within a factor of two of actual effort only 60 percent of the time.
The formula for Basic COCOMO Embedded is:
$$\mathrm{SM} = 3.6 \times (\mathrm{KSLOC})^{1.20}$$
where SM is Staff Months, the politically correct term for what was formerly known as the Mythical Man Month, and KSLOC is thousand (kilo) source lines of code.
Basic COCOMO is based on a database of sixty-three software projects at TRW, Boehm’s then employer, during the 1970s. The Embedded Mode model is based on the twenty-eight (28) of these projects that Boehm classified as Embedded projects. The sixty-three projects were written in FORTRAN (24), COBOL (5), Jovial (5), PL/I (4), Pascal (2), Assembly (20), and miscellaneous other languages (3). None of these is commonly used today. Nonetheless, in the author’s experience, Basic COCOMO Embedded gives a rough order of magnitude (ROM) estimate of the effort for mathematical software projects such as implementing a video codec in C/C++ today (2012).
The Measurement Free Zone
Remarkably, despite the growing cost and importance of software, it is difficult to find publicly available information on the cost, schedule, and effort of software projects. There are a number of consulting firms and proprietary cost and schedule estimation tools but these do not disclose their databases of historical data. Indeed, many organizations, including many commercial businesses, do not seem to use historical data on the cost and schedule of software development to plan projects!
Donald Reifer’s 2004 Software Productivity Data
In 2004, software engineering expert Donald J. Reifer of Reifer Consultants, a colleague of Barry Boehm, published an article, “Industry Software Cost, Quality and Productivity Benchmarks,” in The DoD SoftwareTech News (now The Journal of Software Technology). The article gives software productivity numbers, broken down by categories such as “Scientific” or “Web Business,” for the most recent 600 of the 1,800 projects in his database. These projects date from the seven years prior to 2004 (about 1997 to 2004).
Table 1 below is a subset of Reifer’s data from Table 1 in his paper. It covers the categories that are similar to mathematical software (Command and Control and the Military categories: Airborne, Ground, Missile, and Space) or the same as mathematical software (Scientific). The category “Web Business” is included as a point of reference.
Reifer uses equivalent source lines of code (ESLOC), as defined by the Software Engineering Institute. For new code, one ESLOC is one line of code. For “legacy” code that is modified or reused, each line is multiplied by a weighting factor such as 0.4. This way, data on maintenance or modification of existing software can be combined with data on writing new software.
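The ESLOC weighting described above can be sketched in a few lines of Python. This is only an illustration: the function name `equivalent_sloc`, the single flat weight, and the example project sizes are assumptions made here, not Reifer's or the Software Engineering Institute's actual counting rules, which distinguish several kinds of reuse.

```python
def equivalent_sloc(new_sloc, legacy_sloc, legacy_weight=0.4):
    """Return equivalent source lines of code (ESLOC).

    new_sloc      -- lines of newly written code, counted at full weight
    legacy_sloc   -- lines of modified or reused legacy code
    legacy_weight -- illustrative weighting factor (0.4 is the example
                     value quoted in the text, not a universal constant)
    """
    if new_sloc < 0 or legacy_sloc < 0:
        raise ValueError("line counts must be non-negative")
    if not 0.0 <= legacy_weight <= 1.0:
        raise ValueError("legacy_weight must be between 0 and 1")
    return new_sloc + legacy_weight * legacy_sloc


# Hypothetical project: 10,000 new lines plus 20,000 modified legacy lines.
esloc = equivalent_sloc(10_000, 20_000)
print(f"{esloc:,.0f} ESLOC")
```

With these made-up inputs, the 20,000 legacy lines count as 8,000 equivalent lines, so the project is sized at 18,000 ESLOC rather than 30,000 raw lines.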
|Application Domain||Number Projects||Size Range (KESLOC)||Avg. Productivity (ESLOC/SM)||Range (ESLOC/SM)||Example Application|
|Command & Control||45||35 to 4,500||225||95 to 350||Command centers|
|Military - All||125||15 to 2,125||145||45 to 300||See subcategories|
|Airborne||40||20 to 1,350||105||65 to 250||Embedded sensors|
|Ground||52||25 to 2,125||195||80 to 300||Combat center|
|Missile||15||22 to 125||85||52 to 175||GNC system|
|Space||18||15 to 465||90||45 to 175||Attitude control system|
|Scientific||35||28 to 790||195||130 to 360||Seismic processing|
|Web Business||65||10 to 270||275||190 to 985||Client/server sites|
|Totals||600||10 to 4,500||||45 to 985|||
Table 1 (Abridged): Software Productivity (ESLOC/SM) by Application Domains
Note that productivity, measured in ESLOC (Equivalent Source Lines of Code) per Staff Month, is significantly higher for the Web Business category. The table actually understates the difference because the “Web Business” projects, as indicated elsewhere in Reifer’s article, are usually written in so-called Fourth Generation Languages (4GLs), scripting languages such as Python, Perl, PHP, and so forth, whereas the other software categories are typically written in lower level languages such as C/C++. A single line of a 4GL such as Python often corresponds to several lines of a language such as C/C++.
Scientific software has an average productivity of 195 ESLOC per Staff Month (SM). Note that there is a wide range of variation: 130 to 360 ESLOC per Staff Month (SM). This is for fairly large projects ranging from 28,000 lines of code to 790,000 lines of code.
Basic COCOMO Embedded predicts a productivity of about 143 lines of code per Staff Month for a project with 28,000 lines of code, about 73 lines of code per Staff Month for a project with 790,000 lines of code, and about 280 lines of code per Staff Month for a project with 1,000 lines of code.
The productivity predicted by Basic COCOMO Embedded is quite similar to Reifer’s numbers for the Military Airborne, Missile, and Space categories.
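The productivity figures quoted above can be reproduced with a short Python sketch. It assumes the standard Basic COCOMO Embedded effort equation from Boehm's book, SM = 3.6 × KSLOC^1.20; the function names are chosen here for illustration only.

```python
def cocomo_embedded_staff_months(ksloc):
    """Basic COCOMO Embedded effort in Staff Months: SM = 3.6 * KSLOC**1.20."""
    if ksloc <= 0:
        raise ValueError("ksloc must be positive")
    return 3.6 * ksloc ** 1.20


def cocomo_embedded_productivity(ksloc):
    """Implied productivity in source lines of code per Staff Month."""
    return 1000.0 * ksloc / cocomo_embedded_staff_months(ksloc)


# Project sizes discussed in the text: 1,000, 28,000 and 790,000 lines.
for ksloc in (1, 28, 790):
    effort = cocomo_embedded_staff_months(ksloc)
    rate = cocomo_embedded_productivity(ksloc)
    print(f"{ksloc:>4} KSLOC: {effort:>9.1f} SM, {rate:>6.1f} SLOC per SM")
```

Because the exponent is greater than one, effort grows faster than size, which is why the implied productivity falls from roughly 280 lines per Staff Month at 1 KSLOC to roughly 73 at 790 KSLOC.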
Software productivity numbers are close to meaningless without an associated measure of the quality of the software. Reifer uses the number of bugs/errors/defects per thousand equivalent source lines of code (KESLOC). The error rates upon delivery to the customer show the difference between Web Business and the other categories. When the quality must be high, ideally no errors for mission critical life/death software such as airplane control software (avionics), then the number of lines of code per Staff Month is correspondingly lower.
|Application Domain||Number Projects||Error Range (Errors/KESLOC)||Normative Error Rate (Errors/KESLOC)||Notes|
|Command & Control||45||0.5 to 5||1||Command centers|
|Military — All||125||0.2 to 3||< 1.0||See subcategories|
|— Airborne||40||0.2 to 1.3||0.5||Embedded sensors|
|— Ground||52||0.5 to 4||0.8||Combat center|
|— Missile||15||0.3 to 1.5||0.5||GNC system|
|— Space||18||0.2 to 0.8||0.4||Attitude control system|
|Scientific||35||0.9 to 5||2||Seismic processing|
|Web Business||65||4 to 18||11||Client/server sites|
Table 8 (Abridged): Error Rates upon Delivery by Application Domain
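To make the error rates concrete, the sketch below multiplies the normative error rates from the table above by a project size. The 50 KESLOC project size is a hypothetical value picked because it falls inside the size range of all three categories shown; the dictionary and function names are likewise illustrative assumptions.

```python
# Normative error rates (errors per KESLOC) copied from the abridged table.
NORMATIVE_ERRORS_PER_KESLOC = {
    "Command & Control": 1.0,
    "Airborne": 0.5,
    "Ground": 0.8,
    "Missile": 0.5,
    "Space": 0.4,
    "Scientific": 2.0,
    "Web Business": 11.0,
}


def expected_delivered_errors(domain, kesloc):
    """Expected errors at delivery = normative rate * size in KESLOC."""
    if kesloc < 0:
        raise ValueError("kesloc must be non-negative")
    return NORMATIVE_ERRORS_PER_KESLOC[domain] * kesloc


# Hypothetical 50 KESLOC project in three of the categories.
for domain in ("Space", "Scientific", "Web Business"):
    errors = expected_delivered_errors(domain, 50)
    print(f"{domain:<13} {errors:6.0f} errors at delivery")
```

At the same size, the normative rates imply about 20 delivered errors for Space software, 100 for Scientific software, and 550 for Web Business software, which is the quality gap that the raw productivity numbers hide.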
Quality Requirements for Mathematical Software
The required quality for many types of mathematical software is often very high, meaning less than one error per thousand lines of code. For example, a video codec, such as those used by YouTube or Skype, generates the output, the video, seen and used by the customers. Almost any bug in a video codec will result in visible artifacts at best and will often completely destroy the video. Many readers have probably noticed occasional blurriness or other anomalies in YouTube or other Web video; these are problems that remain after extensive debugging of the video software.
Many video, image, and audio processing applications have quality requirements similar to those of video codecs. Similarly, encryption and decryption algorithms such as the Advanced Encryption Standard (AES) usually require extremely high quality since even a single bit error will result in gibberish. Many other types of mathematical software require similarly high levels of quality. In practice, many seem to have quality requirements similar to those of avionics and the other demanding applications modeled by Basic COCOMO Embedded.
Where Are All The Super Programmers?
It is not uncommon in conversation or in comments on Web blogs to encounter programmers who claim to routinely write five to ten thousand lines of code per month. Nonetheless, Reifer’s data shows little evidence of this performance level. With some exceptions, studies of software productivity usually show much smaller numbers.
There is tremendous variation in software projects. The author once implemented the Advanced Encryption Standard (AES) in about one week. This is about 1500 lines of code. This would translate to 6000 lines of code per month if naively extrapolated. However, this was clearly unusual and stands out in the author’s memory precisely because the project went so quickly and smoothly.
It is probably possible to write many lines of working, usable code per month for certain kinds of simple, straightforward business and user interface software. For example, the top productivity for the Web Business category in Reifer’s published data is 985 ESLOC per Staff Month.
It is clear though that the average performance for the vast majority of software engineers, including most exceptional software engineers, is much less than 5000 lines of code per month for most categories of software projects, with the possible exception of some types of business and user interface software, if one requires reasonable quality.
In the author’s experience, it is common to encounter extremely optimistic ideas about the size, scope, and difficulty level of mathematical software projects. Many people appear to be genuinely unaware of how complex many commonly used examples of mathematical software, such as video codecs, actually are, and of how many lines of code they contain. Similarly, many people seem to be unaware of the quality level needed to produce an acceptable end-user/customer experience such as an enjoyable streaming video. Many people, even technical people who should know better, often seem to consciously or unconsciously use software productivity numbers like 5,000 to 10,000 lines of code per Staff Month even though these are not supported by most historical experience.
How should one use models like Basic COCOMO Embedded that are based on historical data or historical software productivity numbers like Donald Reifer’s data? These are good for rough order of magnitude (ROM) estimates including basic sanity checks. If one only has resources for a two week project and Basic COCOMO says the project is a six month project, one should probably reevaluate one’s plans. On the other hand if one has the resources for a six month project and Basic COCOMO says seven months, the difference is probably not meaningful given the large variation between actual effort and estimated effort. The same applies to blindly plugging in numbers like Reifer’s average 195 lines of code per Staff Month for Scientific software.
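The sanity check described above can be written as a small Python sketch. The factor-of-two threshold is an assumption borrowed from Boehm's remark that Basic COCOMO is within a factor of two of actual effort only 60 percent of the time; the function name and the conversion of two weeks to 0.5 months are also illustrative choices, not part of any published method.

```python
def sanity_check(available_months, estimated_months, tolerance_factor=2.0):
    """Compare available resources with a rough order of magnitude estimate.

    Returns "reevaluate" when the estimate exceeds the available resources
    by more than tolerance_factor, and "not meaningful" when the gap is
    within the expected scatter of the model.
    tolerance_factor=2.0 is an assumed threshold, not a calibrated value.
    """
    if available_months <= 0 or estimated_months <= 0:
        raise ValueError("durations must be positive")
    ratio = estimated_months / available_months
    if ratio > tolerance_factor:
        return "reevaluate"
    return "not meaningful"


# The two cases from the text (two weeks taken as roughly 0.5 months).
print(sanity_check(available_months=0.5, estimated_months=6.0))
print(sanity_check(available_months=6.0, estimated_months=7.0))
```

The first case is off by a factor of twelve and is flagged for reevaluation; the second differs by about 17 percent, well inside the model's scatter, so the difference is treated as not meaningful.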
These models and data are not good for precise scheduling. There is substantial variation between actual and estimated effort. Software seems to inherently involve large variations in effort that are difficult or impossible to predict in advance.
Barry Boehm, Software Engineering Economics, Prentice-Hall, Englewood Cliffs, NJ, 1981
© 2012 John F. McGowan
About the Author
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing video compression and speech recognition technologies. He has extensive experience developing software in C, C++, Visual Basic, Mathematica, MATLAB, and many other programming languages. He is probably best known for his AVI Overview, an Internet FAQ (Frequently Asked Questions) on the Microsoft AVI (Audio Video Interleave) file format. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech). He can be reached at email@example.com.