The Return of the Mythical Man Month


Underestimation of the cost, schedule, and risk of projects is common in software development and especially prevalent in mathematical software development. It is common to encounter extremely optimistic ideas about the duration and difficulty of mathematical software projects, which are ironically among the more difficult kinds of software development, as well as magical ideas about mathematics.

There is relatively little publicly available information on the scope and difficulty level of software projects of any kind. Some information is available in books and papers by various self-styled software engineering experts such as Barry Boehm, Donald Reifer, Capers Jones, and several others. These experts usually have consulting businesses, do not disclose their raw data, and make only limited disclosures of the results of their analyses.

Open source software projects can provide an excellent source of information on some aspects, such as the number of lines of computer code, of various software and mathematical software projects. This information can be independently verified by downloading the source code of an open source project and examining it, using tools like the CLOC utility to count the lines of code if needed.

Unfortunately, it is difficult to get accurate estimates of the actual effort expended on an open source project. It is difficult to verify if a contributor worked part-time, full time, or more than full time on the project. Some contributors may not be credited.

This article examines a data set of ninety-three NASA projects from 1971 to 1987 that was collected by Jairus Hihn of the NASA Jet Propulsion Laboratory (JPL). The data is the NASA 93 data set from the PROMISE Software Engineering Repository at the University of Ottawa.

The data lists the number of source lines of code (SLOC) for each project and the actual effort expended in staff months (SM), and classifies the projects according to the modes of software engineering expert Barry Boehm’s COCOMO I (Constructive Cost Model). The data used for Boehm’s COCOMO I model is also available as a data set in the PROMISE repository.

A Note on Lines of Code

Lines of code is a very imperfect measure of the size and scope of a software project. For example, these are both one line of code in the C Programming Language:

a = 1;

and

a = (1.0/sqrt(2.0*M_PI))*exp(-(x - mean)*(x-mean)/(sigma*sigma));

There are several different definitions of lines of code used in the literature on software cost and schedule estimation. In addition, a range of alternatives to lines of code have been proposed, such as function points (currently popular).
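To make the point about differing definitions concrete, here is a minimal Octave sketch that counts the same small snippet two ways: as physical lines, and as non-blank, non-comment lines. The sample snippet and the crude comment rule (a line starting with // or /*) are assumptions made for illustration; real counters such as CLOC apply far more careful rules.

```octave
% Hypothetical sample: a few lines of C source held in a cell array of strings.
src = { ...
  '/* set a constant */', ...
  '', ...
  'a = 1;', ...
  '// a more complex expression', ...
  'a = (1.0/sqrt(2.0*M_PI))*exp(-(x - mean)*(x-mean)/(sigma*sigma));' };

trimmed = strtrim(src);

% Definition 1: every physical line counts.
physical_lines = numel(src);

% Definition 2: skip blank lines and lines that start a comment.
% (Deliberately crude: it ignores multi-line and trailing comments.)
is_blank = cellfun(@isempty, trimmed);
is_comment = strncmp(trimmed, '//', 2) | strncmp(trimmed, '/*', 2);
code_lines = sum(~is_blank & ~is_comment);

printf('physical lines: %d, code lines: %d\n', physical_lines, code_lines);
```

Even on five lines the two definitions disagree by more than a factor of two, which is one reason published productivity figures are hard to compare.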

Nonetheless, lines of code are somewhat reminiscent of Winston Churchill’s quote about democracy:


It has been said that democracy is the worst form of government except all the others that have been tried.

Function points were developed for business applications and rely heavily on counting the number of inputs and outputs to a program. This often works well for business applications, which are often relatively simple and whose complexity scales with the number of inputs and outputs. Mathematical software such as a video codec often has few inputs (one compressed file or data stream) and outputs (uncompressed video) but a very complex internal implementation (tens of thousands of lines of code). This has been recognized as a weakness of function points for some time and there are some variations such as so-called “feature points” that attempt to address this problem.

Further, methods like function points require substantial training and study to learn and apply. They are not as intuitive as lines of code. There is also much more data on software projects available in lines of code than in function points.

One good way to think about lines of code is that each line of code is like a single moving part in a complex machine like a grandfather clock. Some parts are simple like the first line of code above. Some parts are more complex like the second line of code above. In general, lines of code would correspond to moving parts if one tried to implement a computer program as a mechanical device like Victorian era English mathematician Charles Babbage’s steam driven difference engine.

In mathematical software such as video compression, speech recognition, or other advanced applications, a line of code is usually directly equivalent to a single line of a mathematical formula or equation that a math teacher or professor might write on a blackboard or dry erase board in class. Most examples of mathematics taught in high school or college math courses cover at most a dozen blackboards. These are often building blocks of the mathematical solutions to real-world problems or cutting edge research problems. Most real-world examples of mathematical software, such as the H.264, Flash, or Microsoft Silverlight video codecs used by web sites today, run to many thousands of lines of code and correspond to hundreds or thousands of blackboards filled with mathematical equations and formulas.

Analysis of the NASA 93 Data

The plots below show various aspects of the NASA 93 data on the scope and effort of these software projects.

[Figure: NASA 93 Raw Data (NASA 93 Software Projects)]

[Figure: Project Size in Staff Years (Man Years)]

[Figure: Project Size in Thousands of Lines of Code]

[Figure: Project Years]

[Figure: NASA 93 Shown by COCOMO Mode]

The COCOMO model divides software projects into three general categories or “modes”. These are the embedded, semi-detached, and organic modes. Embedded mode projects such as flight avionics software are most similar in difficulty to mathematical software projects. Indeed, due to safety issues, flight avionics software can be more demanding, requiring higher quality, than commercial applications such as video compression for entertainment. The software productivity in lines of code per staff month is now shown for the three kinds of projects.

[Figure: Software Productivity for Organic Mode Projects (NASA 93)]

[Figure: Software Productivity for Semi-Detached Projects (NASA 93)]

[Figure: Software Productivity for Embedded Projects (NASA 93)]
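Productivity in these plots is simply delivered lines of code divided by staff months. The following Octave sketch shows the calculation on a nine-project subset copied from the raw data in Appendix I (three projects per mode). The subset and the variable names are chosen here for illustration only, so the summary values it prints describe the subset, not the full data set.

```octave
% Columns: KSLOC, actual staff months, COCOMO mode
% (1 = organic, 2 = semi-detached, 3 = embedded).
% Rows are projects 63, 64, 68, 1, 12, 59, 60, 81 and 98 of nasa93_raw_data.txt.
sample = [  90    162   1;
            40    150   1;
           240    192   1;
          25.9  117.6   2;
           100    360   2;
           980   4560   2;
           350    720   3;
            32   1350   3;
           233   8211   3];

ksloc        = sample(:,1);
staff_months = sample(:,2);
cocomo_mode  = sample(:,3);

% Source lines of code per staff month (SLOC/SM)
productivity = 1000.0 * ksloc ./ staff_months;

mode_names = {'organic', 'semi-detached', 'embedded'};
for m = 1:3
  p = productivity(cocomo_mode == m);
  printf('%-13s median %7.1f  min %7.1f  max %7.1f SLOC/SM\n', ...
         mode_names{m}, median(p), min(p), max(p));
end
```

The same one-line formula, applied to all ninety-three rows, produces the histograms above.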

The next plot compares the NASA 93 data to Barry Boehm’s Basic COCOMO I model for Embedded Projects (red line) and to a linear fit to the NASA 93 data (green line). As can be seen, there is considerable variation between actual and estimated effort, although the models are on average roughly correct and usually within a factor of three of actual effort.

[Figure: Comparison of Data to Fitted Models]
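Both models in this comparison are power laws of the form effort = A × KSLOC^B, which plot as straight lines on log-log axes. The sketch below, written under stated assumptions, evaluates the Basic COCOMO embedded model (coefficients 3.6 and 1.2, as used in Appendix I) and fits a power law with polyfit in log-log space. It uses only seven embedded projects copied from the raw data, so the fitted coefficients will differ from those of the full fit shown in the plot.

```octave
% [KSLOC, actual staff months] for embedded-mode projects
% 60, 61, 62, 65, 66, 79 and 80 of nasa93_raw_data.txt (illustrative subset).
emb = [350 720; 70 458; 271 2460; 137 636; 150 882; 60 409; 100 703];
ksloc     = emb(:,1);
actual_sm = emb(:,2);

% Basic COCOMO 81, embedded mode: SM = 3.6 * KSLOC^1.2
cocomo_sm = 3.6 * ksloc.^1.2;

% Power-law fit: a straight line in log-log space,
% log10(SM) = B*log10(KSLOC) + log10(A)
p = polyfit(log10(ksloc), log10(actual_sm), 1);
B = p(1);
A = 10^p(2);
fit_sm = A * ksloc.^B;

printf('fitted model: SM = %.2f * KSLOC^%.2f\n', A, B);
for k = 1:numel(ksloc)
  printf('%5.0f KSLOC: actual %6.0f SM, COCOMO %6.0f SM, fit %6.0f SM\n', ...
         ksloc(k), actual_sm(k), cocomo_sm(k), fit_sm(k));
end
```

Fitting in log space minimizes relative rather than absolute deviations, which suits data spanning several orders of magnitude.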

The final plot shows the relative error between the actual effort and the estimated effort using the fitted model.

[Figure: Relative Error (NASA 93)]
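Relative error here means (estimated effort − actual effort) / actual effort. Two summary statistics are commonly derived from it: MMRE, the mean magnitude of relative error, and PRED(30), the share of estimates within 30% of the actual value. The following is a minimal sketch on a seven-project embedded subset of the raw data; the subset is an assumption for illustration, and because the model is fitted and evaluated on the same few points the resulting numbers are optimistic.

```octave
% [KSLOC, actual staff months] for embedded-mode projects
% 60, 61, 62, 65, 66, 79 and 80 of nasa93_raw_data.txt (illustrative subset).
emb = [350 720; 70 458; 271 2460; 137 636; 150 882; 60 409; 100 703];
ksloc     = emb(:,1);
actual_sm = emb(:,2);

% Estimator: power law fitted to this subset in log-log space.
p = polyfit(log10(ksloc), log10(actual_sm), 1);
estimated_sm = 10.^polyval(p, log10(ksloc));

relative_error = (estimated_sm - actual_sm) ./ actual_sm;
abs_re = abs(relative_error);

mmre        = mean(abs_re);                 % Mean Magnitude of Relative Error
n_within_30 = sum(abs_re <= 0.3);           % estimates within 30% of actual
pred30      = n_within_30 / numel(abs_re);  % PRED(30) as a fraction

printf('MMRE = %.2f, PRED(30) = %d of %d (%.0f%%)\n', ...
       mmre, n_within_30, numel(abs_re), 100 * pred30);
```

Note that the absolute value is taken before comparing against 0.3; comparing the signed error would count every large underestimate as a hit.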

Conclusion

On average, the software productivity for demanding software applications such as embedded aerospace applications tends to be quite low, in the range of two-hundred (200) lines of code per staff month (mythical man month). However, there is wide variation between actual and estimated effort. The highest productivity (defined as lines of code per staff month) among the embedded projects in the NASA 93 data set was about 700 lines of code per month, and the lowest around 50 lines of code per month. Given the difficulties in defining lines of code and measuring the quality of the delivered software, it is impossible to evaluate the significance of these variations without more detailed information on the projects.

It is important to keep in mind that numbers like two-hundred lines of code per staff month do not refer to just typing two-hundred lines of code, which can take as little as a few minutes. They refer to the entire software development process, usually including requirements analysis, software design, actual coding, and especially debugging to achieve the high levels of quality required for these applications.
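These productivity figures can be turned into a rough order of magnitude estimate with simple division. The sketch below assumes a hypothetical 20,000-line project (the size is invented for illustration, in line with the "tens of thousands of lines" quoted earlier for video codecs) and applies the low, typical, and high embedded productivity figures quoted in this conclusion.

```octave
sloc = 20000;                  % hypothetical project size in source lines of code
productivity = [50 200 700];   % SLOC per staff month: low, typical, high (from the text)

staff_months = sloc ./ productivity;
staff_years  = staff_months / 12;

labels = {'low', 'typical', 'high'};
for k = 1:3
  printf('%-8s productivity (%3d SLOC/SM): %6.1f staff months (%4.1f staff years)\n', ...
         labels{k}, productivity(k), staff_months(k), staff_years(k));
end
```

The spread between the low and high cases is a factor of fourteen, so a figure like this is only a sanity check against gross underestimates, not a schedule.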

There are several cases where a single error in a single line of mathematical software has resulted in the loss of a multi-million dollar mission or human lives. The loss of the Mariner 1 probe to Venus in 1962 is frequently attributed to a small error in copying a mathematical formula into the guidance software. In 1991 a subtle error in the mathematical software for a PATRIOT missile system resulted in an Iraqi SCUD missile penetrating to a US base in Dhahran, Saudi Arabia and killing 28 soldiers. On June 4, 1996 the European Space Agency’s first launch of the new Ariane 5 rocket ended in an explosion due to an error converting a 64 bit floating point number to a 16 bit integer number in software. The loss of NASA’s Mars Climate Orbiter (MCO) in 1999 has been attributed to a mismatch between English units (pound-force seconds) and metric units (newton-seconds). Aviation and rocketry have especially demanding requirements for the quality of software.

While commercial applications of mathematical software such as video compression for entertainment are not always as demanding as mission-critical aerospace software, they can still be quite demanding. Viewers of compressed video such as Netflix, YouTube, Blu-ray, or DVD video have a pretty limited tolerance for visible artifacts and errors in the video. Almost any error in the implementation of a video codec can introduce visible artifacts or errors, so the codecs must, in general, achieve very high levels of quality, though not necessarily perfection.

Credits

Sayyad Shirabad, J. and Menzies, T.J. (2005) The PROMISE Repository of Software Engineering Databases. School of Information Technology and Engineering, University of Ottawa, Canada. Available: http://promise.site.uottawa.ca/SERepository

© 2012 John F. McGowan

About the Author

John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing video compression and speech recognition technologies. He has extensive experience developing software in C, C++, Visual Basic, Mathematica, MATLAB, and many other programming languages. He is probably best known for his AVI Overview, an Internet FAQ (Frequently Asked Questions) on the Microsoft AVI (Audio Video Interleave) file format. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech). He can be reached at jmcgowan11@earthlink.net.


Appendix I: Source Code for Analysis

The analysis was performed using a program written in the free, open source Octave numerical programming environment, which is mostly compatible with MATLAB. Here is the code. It generates additional plots beyond the ones highlighted in the body of this article. The raw data file nasa93_raw_data.txt, which is extracted from the PROMISE data file, follows the code.

% Analysis of NASA 93 software effort data
%
% (C) 2012 John F. McGowan, Ph.D.
% E-Mail: jmcgowan11@earthlink.net
%

data93 = dlmread('nasa93_raw_data.txt');

% COCOMO (Barry Boehm's Constructive Cost Model) MODE CODES (1=ORGANIC, 2=SEMI-DETACHED, 3=EMBEDDED)

[e_row, e_col] = find(data93(:,7) == 3);
[semi_row, semi_col] = find(data93(:,7) == 2);
[org_row, org_col] = find(data93(:,7) == 1);


actuals = data93(:,end-1:end);

ksloc = actuals(:,1); % thousand (kilo) source lines of code
staff_months = actuals(:,2); % also known as man month, work month, person month


printf('making figure 1\n');
fflush(stdout);

figure(1);
loglog(ksloc, staff_months, 'o');
title('NASA 93 SOFTWARE PROJECT DATA');
xlabel('Thousands of Lines of Code (KSLOC)');
ylabel('Staff Months (SM)');
print('nasa93_raw_data.jpg');


logloc = log10(ksloc);
log_staff_months = log10(staff_months);

[p_nasa93, s_nasa93] = polyfit(logloc, log_staff_months, 1); % fit polynomial model to the data

pred_logloc = polyval(p_nasa93, logloc);

delta = 10.^pred_logloc - staff_months; % difference between predicted staff months and actual staff months

relative_error = delta ./ staff_months; % (Estimated Staff Months - Actual Staff Months)/Actual Staff Months

cocomo_x = 1:10:max(ksloc(:));
y = polyval(p_nasa93, log10(cocomo_x));

cocomo_org = 2.4 * (cocomo_x).^1.05; % Barry Boehm's Basic COCOMO 81 (Organic) model
cocomo_semi = 3.0 * (cocomo_x).^1.12; % Barry Boehm's Basic COCOMO 81 (Semi-detached) model
cocomo_e = 3.6 * (cocomo_x).^1.2; % Barry Boehm's Basic COCOMO 81 (Embedded) model

printf('making figure 2\n');
fflush(stdout);

figure(2);
% loglog(ksloc, staff_months, 'o', ksloc, 10.^pred_logloc, '*');
loglog(ksloc, staff_months, 'o', cocomo_x, 10.^y, '-', "linewidth", 3, cocomo_x, cocomo_e, 'r-', "linewidth", 3);
title('FIT TO NASA 93 SOFTWARE PROJECT DATA');
xlabel('Thousands of Lines of Code (KSLOC)'); % thousand source lines of code
ylabel('Staff Months (SM)'); % staff month
legend("NASA 93 DATA", "FIT 93", "COCOMO 81 (EMBEDDED)");
print('nasa93_fit.jpg');


A = 10.^p_nasa93(2);
B = p_nasa93(1);

x = 1:100:5000;
x = x / 1000.0;
y = A.*(x.^B);

printf('making figure 3\n');
fflush(stdout);


figure(3);
%plot(x,y);
hist(relative_error, 20);
title('Relative Error of Estimates');
xlabel('(Estimated Staff Months - Actual Staff Months)/Actual Staff Months');
ylabel('Number of Projects');
print('nasa93_relative_error.jpg');


max_ksloc = max(ksloc(:));
mean_ksloc = mean(ksloc(:));
min_ksloc = min(ksloc(:));

max_mm = max(staff_months(:));
mean_mm = mean(staff_months(:));
min_mm = min(staff_months(:));

mean_are = mean(abs(relative_error(:))); % known as MMRE Mean Magnitude of Relative Error
max_are = max(abs(relative_error(:)));
min_are = min(abs(relative_error(:)));

prod = 1000.0*ksloc ./ staff_months;

max_prod = max(prod(:));
mean_prod = mean(prod(:));
median_prod = median(prod(:));
min_prod = min(prod(:));
std_prod = std(prod(:)); % standard deviation of software productivity

printf('making figure 4\n');
fflush(stdout);

figure(4);
hist(prod, 20);
title('Software Productivity of NASA 93 Projects');
xlabel('Lines of Code per Staff Month (SLOC/SM)');
ylabel('Number of Projects');
print('nasa93_prod.jpg');


% PRED(30) is the number of estimates within 30% of the actual value

ind = find(abs(relative_error(:)) <= 0.3); % take abs() before comparing, so large underestimates are not counted
pred30 = numel(ind);

% gaussian/normal point of reference

g_data = randn(1,93);
mean_g = mean(g_data(:));
std_g = std(g_data(:));
skewness_g = skewness(g_data(:));
kurtosis_g = kurtosis(g_data(:)); % technically the kurtosis in the Octave version used here is the "excess kurtosis", which is defined so the kurtosis of the Normal distribution has an expected value of zero

mean_re = mean(relative_error(:));
std_re = std(relative_error(:));
skewness_re = skewness(relative_error(:));
kurtosis_re = kurtosis(relative_error(:));

printf('making figure 5\n');
fflush(stdout);

figure(5)
hist(g_data*std_re + mean_re, 20);
title('Normal Distribution Data');
ylabel('Number Samples');
xlabel('Scaled Relative Error');
print('nasa93_scaled_normal.jpg'); % figure 5 as JPEG


% display the distribution of the kurtosis of the normal distribution

fflush(stdout);
printf("computing kurtosis of normal distribution\n");
fflush(stdout);

g_data_k = randn(10000, 93); % 10,000 test sets of 93 samples
g_kurtosis = kurtosis(g_data_k,2);

figure(6);
hist(g_kurtosis, 20);
title('Excess Kurtosis of Normal Distribution');
xlabel('Kurtosis');
ylabel('Number of Test Sets');
print('normal_kurtosis_distribution.jpg');


g_skewness = skewness(g_data_k, 2);

printf('making figure 7\n');
fflush(stdout);

figure(7)
hist(g_skewness, 20);
title('Skewness of Normal Distribution');
xlabel('Skewness');
ylabel('Number of Test Sets');
print('normal_skewness_distribution.jpg');



% tails

x = -10.0:0.1:10.0;
y = (1.0/sqrt(2*pi))*exp(-x.^2/2.0);

printf('making figure 8\n');
fflush(stdout);

figure(8)
plot(x,y,'-', 'linewidth', 3);
title('Normal Distribution (Thin Tails)');
print('normal.jpg');


y_cauchy = 1.0./(1.0 + x.^2);
norm_cauchy = 0.1*sum(y_cauchy);
y_cauchy = y_cauchy ./ norm_cauchy;
figure(9)
plot(x,y_cauchy,'-', 'linewidth', 3);
title('Cauchy Distribution (Fat Tails)');
print('cauchy.jpg');

printf('making figure 10\n');
fflush(stdout);

figure(10);
plot(x, y, '-', 'linewidth', 3, x, y_cauchy, '-g', 'linewidth', 3);
title('Normal and Cauchy Distributions Together');
legend('Normal', 'Cauchy');
legend('boxon'); % put box around legend
print('normal_cauchy.jpg');

year = data93(:,6); % year of project
printf('making figure 11\n');
fflush(stdout);

figure(11);
years = 1970:1990;
hist(year, years);
title('NASA 93 Software Projects by Year');
xlabel('Year');
ylabel('Number of Projects');
print('project_years.jpg');

printf('making figure 12\n');
fflush(stdout);

figure(12)
hist(ksloc, 50);
title('Size of NASA 93 Software Projects');
xlabel('Thousands of Lines of Code (KSLOC)');
ylabel('Number of Projects');
print('project_size_ksloc.jpg');

printf('making figure 13\n');
fflush(stdout);

figure(13)
hist(staff_months, 50);
title('Size of NASA 93 Software Projects');
xlabel('Staff Months');
ylabel('Number of Projects');
print('project_size_sm.jpg');

printf('making figure 14\n');
fflush(stdout);

figure(14)
staff_years = staff_months / 12.; % convert to mythical man year/staff year
hist(staff_years, 50);
title('Size of NASA 93 Software Projects');
xlabel('Staff Years');
ylabel('Number of Projects');
print('project_size_sy.jpg');

% plots for different COCOMO Modes

printf('making figure 15\n');
fflush(stdout);

figure(15)
loglog(ksloc(e_row), staff_months(e_row), 'o', cocomo_x, cocomo_e, 'r-');
title('NASA 93 DATA (EMBEDDED PROJECTS ONLY)');
xlabel('Thousands of Lines of Code (KSLOC)');
ylabel('Staff Months (SM)');
legend('Embedded Data', 'Embedded Model', 'location', 'northwest');
legend("boxon");
print('nasa93_embedded_data.jpg');
% largest effort project is embedded as might expect

printf('making figure 16\n');
fflush(stdout);

figure(16)
loglog(ksloc(semi_row), staff_months(semi_row), 'o', cocomo_x, cocomo_semi, 'r-');
title('NASA 93 DATA (SEMI-DETACHED PROJECTS ONLY)');
xlabel('Thousands of Lines of Code (KSLOC)');
ylabel('Staff Months (SM)');
legend('Semi Detached Data', 'Semi Detached Model', 'location', 'northwest');
legend("boxon");
print('nasa93_semi_data.jpg');
% largest size (KSLOC) project is semi-detached

printf('making figure 17\n');
fflush(stdout);

figure(17)
loglog(ksloc(org_row), staff_months(org_row), 'o', cocomo_x, cocomo_org, 'r-');
title('NASA 93 DATA (ORGANIC PROJECTS ONLY)');
xlabel('Thousands of Lines of Code (KSLOC)');
ylabel('Staff Months (SM)');
legend('Organic Data', 'Organic Model', 'location', 'northwest');
legend("boxon");
print('nasa93_org_data.jpg');

printf('making figure 18\n');
fflush(stdout);

figure(18)
loglog(ksloc(org_row), staff_months(org_row), '*k', ksloc(semi_row), staff_months(semi_row), 'ob', ksloc(e_row), staff_months(e_row),'or');
title('NASA 93 DATA (ALL PROJECTS)');
xlabel('Thousands of Lines of Code (KSLOC)');
ylabel('Staff Months (SM)');
legend('Organic', 'Semi-Detached', 'Embedded', "location", "northwest");
legend("boxon");
print('nasa93_by_mode_data.jpg');

% productivity by cocomo mode

printf('making figure 19\n');
fflush(stdout);

figure(19);
hist(prod(org_row), 20);
title('Software Productivity Organic Mode');
xlabel('Lines of Code per Staff Month (SLOC/SM)');
ylabel('Number of Projects');
print('nasa93_prod_org.jpg');

printf('making figure 20\n');
fflush(stdout);

figure(20);
hist(prod(semi_row), 20);
title('Software Productivity Semi Detached Mode');
xlabel('Lines of Code per Staff Month (SLOC/SM)');
ylabel('Number of Projects');
print('nasa93_prod_semi.jpg');

printf('making figure 21\n');
fflush(stdout);

figure(21);
hist(prod(e_row), 20);
title('Software Productivity Embedded Mode');
xlabel('Lines of Code per Staff Month (SLOC/SM)');
ylabel('Number of Projects');
print('nasa93_prod_embedded.jpg');


printf("ALL DONE\n");
fflush(stdout);


nasa93_raw_data.txt

1,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,25.9,117.6
2,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,24.6,117.6
3,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,7.7,31.2
4,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,8.2,36
5,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,9.7,25.2
6,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,2.2,8.4
7,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,3.5,10.8
8,erb,avionicsmonitoring,g,2,1982,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,66.6,352.8
9,gal,missionplanning,g,1,1980,2,h,l,h,xh,xh,l,h,h,h,h,n,h,h,h,n,7.5,72
10,gal,missionplanning,g,1,1980,2,n,l,h,n,n,l,l,h,vh,vh,n,h,n,n,n,20,72
11,gal,missionplanning,g,1,1984,2,n,l,h,n,n,l,l,h,vh,h,n,h,n,n,n,6,24
12,gal,missionplanning,g,1,1980,2,n,l,h,n,n,l,l,h,vh,vh,n,h,n,n,n,100,360
13,gal,missionplanning,g,1,1985,2,n,l,h,n,n,l,l,h,vh,n,n,l,n,n,n,11.3,36
14,gal,missionplanning,g,1,1980,2,n,l,h,n,n,h,l,h,h,h,l,vl,n,n,n,100,215
15,gal,missionplanning,g,1,1983,2,n,l,h,n,n,l,l,h,vh,h,n,h,n,n,n,20,48
16,gal,missionplanning,g,1,1982,2,n,l,h,n,n,l,l,h,n,n,n,vl,n,n,n,100,360
17,gal,missionplanning,g,1,1980,2,n,l,h,n,xh,l,l,h,vh,vh,n,h,n,n,n,150,324
18,gal,missionplanning,g,1,1984,2,n,l,h,n,n,l,l,h,h,h,n,h,n,n,n,31.5,60
19,gal,missionplanning,g,1,1983,2,n,l,h,n,n,l,l,h,vh,h,n,h,n,n,n,15,48
20,gal,missionplanning,g,1,1984,2,n,l,h,n,xh,l,l,h,h,n,n,h,n,n,n,32.5,60
21,X,avionicsmonitoring,g,2,1985,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,19.7,60
22,X,avionicsmonitoring,g,2,1985,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,66.6,300
23,X,simulation,g,2,1985,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,29.5,120
24,X,monitor_control,g,2,1986,2,h,n,n,h,n,n,n,n,h,h,n,n,n,n,n,15,90
25,X,monitor_control,g,2,1986,2,h,n,h,n,n,n,n,n,h,h,n,n,n,n,n,38,210
26,X,monitor_control,g,2,1986,2,n,n,n,n,n,n,n,n,h,h,n,n,n,n,n,10,48
27,X,realdataprocessing,g,2,1982,2,n,vh,h,vh,vh,l,h,vh,h,n,l,h,vh,vh,l,15.4,70
28,X,realdataprocessing,g,2,1982,2,n,vh,h,vh,vh,l,h,vh,h,n,l,h,vh,vh,l,48.5,239
29,X,realdataprocessing,g,2,1982,2,n,vh,h,vh,vh,l,h,vh,h,n,l,h,vh,vh,l,16.3,82
30,X,communications,g,2,1982,2,n,vh,h,vh,vh,l,h,vh,h,n,l,h,vh,vh,l,12.8,62
31,X,batchdataprocessing,g,2,1982,2,n,vh,h,vh,vh,l,h,vh,h,n,l,h,vh,vh,l,32.6,170
32,X,datacapture,g,2,1982,2,n,vh,h,vh,vh,l,h,vh,h,n,l,h,vh,vh,l,35.5,192
33,X,missionplanning,g,2,1985,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,5.5,18
34,X,avionicsmonitoring,g,2,1987,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,10.4,50
35,X,avionicsmonitoring,g,2,1987,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,14,60
36,X,monitor_control,g,2,1986,2,h,n,h,n,n,n,n,n,n,n,n,n,n,n,n,6.5,42
37,X,monitor_control,g,2,1986,2,n,n,h,n,n,n,n,n,n,n,n,n,n,n,n,13,60
38,X,monitor_control,g,2,1986,2,n,n,h,n,n,n,n,n,n,h,n,h,h,h,n,90,444
39,X,monitor_control,g,2,1986,2,n,n,h,n,n,n,n,n,n,n,n,n,n,n,n,8,42
40,X,monitor_control,g,2,1986,2,n,n,h,h,n,n,n,n,n,n,n,n,n,n,n,16,114
41,hst,datacapture,g,2,1980,2,n,h,h,vh,h,l,h,h,n,h,l,h,h,n,l,177.9,1248
42,slp,launchprocessing,g,6,1975,2,h,l,h,n,n,l,l,n,n,h,n,n,h,vl,n,302,2400
43,Y,application_ground,g,5,1982,2,n,h,l,n,n,h,n,h,h,n,n,n,h,h,n,282.1,1368
44,Y,application_ground,g,5,1982,2,h,h,l,n,n,n,h,h,h,n,n,n,h,n,n,284.7,973
45,Y,avionicsmonitoring,g,5,1982,2,h,h,n,n,n,l,l,n,h,h,n,h,n,n,n,79,400
46,Y,avionicsmonitoring,g,5,1977,2,l,n,n,n,n,l,l,h,h,vh,n,h,l,l,h,423,2400
47,Y,missionplanning,g,5,1977,2,n,n,n,n,n,l,n,h,vh,vh,l,h,h,n,n,190,420
48,Y,missionplanning,g,5,1984,2,n,n,h,n,h,n,n,h,h,n,n,h,h,n,h,47.5,252
49,Y,missionplanning,g,5,1980,2,vh,n,xh,h,h,l,l,n,h,n,n,n,l,h,n,21,107
50,Y,simulation,g,5,1983,2,n,h,h,vh,n,n,h,h,h,h,n,h,l,l,h,78,571.4
51,Y,simulation,g,5,1984,2,n,h,h,vh,n,n,h,h,h,h,n,h,l,l,h,11.4,98.8
52,Y,simulation,g,5,1985,2,n,h,h,vh,n,n,h,h,h,h,n,h,l,l,h,19.3,155
53,Y,missionplanning,g,5,1979,2,h,n,vh,h,h,l,h,h,n,n,h,h,l,vh,h,101,750
54,Y,missionplanning,g,5,1979,2,h,n,h,h,h,l,h,n,h,n,n,n,l,vh,n,219,2120
55,Y,utility,g,5,1979,2,h,n,h,h,h,l,h,n,h,n,n,n,l,vh,n,50,370
56,spl,datacapture,g,2,1979,2,vh,h,h,vh,vh,n,n,vh,vh,vh,n,h,h,h,l,227,1181
57,spl,batchdataprocessing,g,2,1977,2,n,h,vh,n,n,l,n,h,n,vh,l,n,h,n,l,70,278
58,de,avionicsmonitoring,g,2,1979,2,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,0.9,8.4
59,slp,operatingsystem,g,6,1974,2,vh,l,xh,xh,vh,l,l,h,vh,h,vl,h,vl,vl,h,980,4560
60,slp,operatingsystem,g,6,1975,3,n,l,h,n,n,l,l,vh,n,vh,h,h,n,l,n,350,720
61,Y,operatingsystem,g,5,1976,3,h,n,xh,h,h,l,l,h,n,n,h,h,h,h,n,70,458
62,Y,utility,g,5,1979,3,h,n,xh,h,h,l,l,h,n,n,h,h,h,h,n,271,2460
63,Y,avionicsmonitoring,g,5,1971,1,n,n,n,n,n,l,l,h,h,h,n,h,n,l,n,90,162
64,Y,avionicsmonitoring,g,5,1980,1,n,n,n,n,n,l,l,h,h,h,n,h,n,l,n,40,150
65,Y,avionicsmonitoring,g,5,1979,3,h,n,h,h,n,l,l,h,h,h,n,h,n,n,n,137,636
66,Y,avionicsmonitoring,g,5,1977,3,h,n,h,h,n,h,l,h,h,h,n,h,n,vl,n,150,882
67,Y,avionicsmonitoring,g,5,1976,3,vh,n,h,h,n,l,l,h,h,h,n,h,n,n,n,339,444
68,Y,avionicsmonitoring,g,5,1983,1,l,h,l,n,n,h,l,h,h,h,n,h,n,l,n,240,192
69,Y,avionicsmonitoring,g,5,1978,2,h,n,h,n,vh,l,n,h,h,h,h,h,l,l,l,144,576
70,Y,avionicsmonitoring,g,5,1979,2,n,l,n,n,vh,l,n,h,h,h,h,h,l,l,l,151,432
71,Y,avionicsmonitoring,g,5,1979,2,n,l,h,n,vh,l,n,h,h,h,h,h,l,l,l,34,72
72,Y,avionicsmonitoring,g,5,1979,2,n,n,h,n,vh,l,n,h,h,h,h,h,l,l,l,98,300
73,Y,avionicsmonitoring,g,5,1979,2,n,n,h,n,vh,l,n,h,h,h,h,h,l,l,l,85,300
74,Y,avionicsmonitoring,g,5,1982,2,n,l,n,n,vh,l,n,h,h,h,h,h,l,l,l,20,240
75,Y,avionicsmonitoring,g,5,1978,2,n,l,n,n,vh,l,n,h,h,h,h,h,l,l,l,111,600
76,Y,avionicsmonitoring,g,5,1978,2,h,vh,h,n,vh,l,n,h,h,h,h,h,l,l,l,162,756
77,Y,avionicsmonitoring,g,5,1978,2,h,h,vh,n,vh,l,n,h,h,h,h,h,l,l,l,352,1200
78,Y,operatingsystem,g,5,1979,2,h,n,vh,n,vh,l,n,h,h,h,h,h,l,l,l,165,97
79,Y,missionplanning,g,5,1984,3,h,n,vh,h,h,l,vh,h,n,n,h,h,h,vh,h,60,409
80,Y,missionplanning,g,5,1984,3,h,n,vh,h,h,l,vh,h,n,n,h,h,h,vh,h,100,703
81,hst,Avionics,f,2,1980,3,h,vh,vh,xh,xh,h,h,n,n,n,l,l,n,n,h,32,1350
82,hst,Avionics,f,2,1980,3,h,h,h,vh,xh,h,h,h,h,h,h,h,h,n,n,53,480
84,spl,Avionics,f,3,1977,3,h,l,vh,vh,xh,l,n,vh,vh,vh,vl,vl,h,h,n,41,599
89,spl,Avionics,f,3,1977,3,h,l,vh,vh,xh,l,n,vh,vh,vh,vl,vl,h,h,n,24,430
91,Y,Avionics,f,5,1977,3,vh,h,vh,xh,xh,n,n,h,h,h,h,h,h,n,h,165,4178.2
92,Y,science,f,5,1977,3,vh,h,vh,xh,xh,n,n,h,h,h,h,h,h,n,h,65,1772.5
93,Y,Avionics,f,5,1977,3,vh,h,vh,xh,xh,n,l,h,h,h,h,h,h,n,h,70,1645.9
94,Y,Avionics,f,5,1977,3,vh,h,xh,xh,xh,n,n,h,h,h,h,h,h,n,h,50,1924.5
97,gal,Avionics,f,5,1982,3,vh,l,vh,vh,xh,l,l,h,l,n,vl,l,l,h,h,7.25,648
98,Y,Avionics,f,5,1980,3,vh,h,vh,xh,xh,n,n,h,h,h,h,h,h,n,h,233,8211
99,X,Avionics,f,2,1983,3,h,n,vh,vh,vh,h,h,n,n,n,l,l,n,n,h,16.3,480
100,X,Avionics,f,2,1983,3,h,n,vh,vh,vh,h,h,n,n,n,l,l,n,n,h,6.2,12
101,X,science,f,2,1983,3,h,n,vh,vh,vh,h,h,n,n,n,l,l,n,n,h,3,38
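Each row above has 24 comma-separated fields, many of them text. The analysis code reads only three numeric ones: the COCOMO mode in field 7, and the size in KSLOC and effort in staff months in the last two fields. The following sketch shows one way to pull those values out of a single row explicitly, rather than relying on how dlmread treats the text fields. The variable names are chosen here for illustration and are not part of the original analysis.

```octave
% One row of nasa93_raw_data.txt (project 60), copied verbatim.
row_text = '60,slp,operatingsystem,g,6,1975,3,n,l,h,n,n,l,l,vh,n,vh,h,h,n,l,n,350,720';

fields = strsplit(row_text, ',');

project_id   = str2double(fields{1});
project_year = str2double(fields{6});
cocomo_mode  = str2double(fields{7});       % 1=organic, 2=semi-detached, 3=embedded
ksloc        = str2double(fields{end-1});   % thousand source lines of code
staff_months = str2double(fields{end});     % actual effort

printf('project %d (%d): mode %d, %.1f KSLOC, %.1f staff months\n', ...
       project_id, project_year, cocomo_mode, ksloc, staff_months);
```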



2 Responses to “The Return of the Mythical Man Month”

  1. Timothy Fries says:

    I think you really missed the point of Mythical Man Month.

    Mythical Man Month’s key takeaway is that adding people to a project makes the project slower — in other words, you can’t directly correlate man/staff months to development productivity since productivity doesn’t scale linearly with the number of bodies working on the task. (Thus, the whole idea of measuring a project in ‘man months’ is bunk because the idea of a ‘man month’ is mythical in the first place since the men aren’t working in isolation.)

    And yet, the sentence in this article where you specifically point to the Mythical Man Month, you’re doing exactly the opposite of what it advises by correlating staff months to lines of code; even going as far as stating there’s an average, as if that average actually means anything.

  2. The author responds:

    I am using the phrase “Mythical Man Month” in a more general way to indicate the hopefully by now well known tendency to underestimate the cost and schedule of software projects as well as the many difficulties in planning such projects. The IBM System/360 and OS/360 project that Brooks writes about is now a well known early example of the many problems frequently encountered.

    I don’t see a contradiction between my use of mathematical models like COCOMO and Brooks’s argument that adding software engineers to a late project can make the project take even longer. This problem can be taken as an argument for estimating the scope correctly at the start and hiring enough staff at the start, since attempting to recover later by hiring more staff may fail.

    I am careful to emphasize in both the text and illustrations the tremendous variation in how long software projects take as a function of lines of code (or other measures).

    Models like COCOMO and simple averages of number of lines of code per staff month are only useful, in my opinion, for getting a ballpark or rough order of magnitude value for the scope of a project. Since I have encountered many cases where people underestimate the scope of mathematical software projects by factors of ten (10) to one-hundred (100), I think these models and numbers are useful for avoiding these kinds of gross errors.

    However, it would be a mistake to think these models or numbers can give anything like precise estimates of the duration and effort of software development projects valid to within ten or twenty percent as we might expect in other often repetitive physical activities such as building a house with a standard design.

    Sincerely,

    John
