Let me start this by saying that Configuration
Management, in my terminology, means the practice of maintaining and managing
source code, builds, unit tests and releases – essentially the management of
all development and release artefacts and not something else.
Personally I don’t consider
myself a Configuration Management person. I have done the role several times, even
for teams of more than 50 developers, but it is not something that I
aspire to do. What I do have for CM, however, is a deep appreciation of just how vital
it is to the success of delivering an IT project, and in particular one that
involves offsite development, multiple releases and on-going change management.
For these reasons most people who work with me know that I evangelise the
importance of CM a lot. You can chalk that up to having a lot of experience
working on some very complex IT projects where I realised pretty quickly how essential
it was to have a functioning CM practice that ensured everything that was
developed was built, tested and deployed in a methodical, efficient and
systematic manner. But this post is not
going to be a lecture on the finer points of branching, merging, build
automation, continuous integration and so on – there’s more than enough stuff
about that on the Internet to keep anyone amused for months. Instead I’m going
to have a rant around the question of why it is sometimes still done so poorly
in the IT industry.
First up let me say that not
every client I have worked at, or company I have worked for, does it badly or
undervalues it, but a decent number still do. I’ve seen an alarming number of
projects being run over the years where a source code repository, a couple of
branching diagrams and maybe, just maybe, a few install scripts is as far as the
thought process goes with CM. Worse still the implementation and maintenance of
it is usually done by developers, usually a few “heroes” that know how it
should all hang together, with little consideration given to anything except to
just get the release out the door. In short it usually fails once the project
gets big. It’s the same old story, the business gets more funding or wants more
features and as such, more releases are then needed. All of a sudden the
development team realises that the solution now needs to be deployed across
multiple servers to cope with the higher demands. The architect gets consulted,
updates the architecture and in turn increases the deployment complexity
to cope with the new features, all of which now need to be incorporated into the
builds, deployment scripts and installers. In a panic, code branches are
created left, right and centre to cope with multiple releases and development
teams. Merges then get made, often in haste, overwriting code already produced,
causing more delays, and down the slippery slope the project goes. An even worse
scenario in this situation is when you have multi-site development teams, especially
if they are located in another country or time-zone, because the chaos spreads
to these areas as well, compounding all the problems. By the end your source
code repository resembles a weeping willow tree.
The trunk is in there somewhere!
Not good.
So why do these problems always
happen? Could it be the architects not catering for it in the estimates or the
finalising of the project plan? Possibly, but it’s not like architects weren’t
developers once, so they would know how tight the leash needs to be to keep developers
in check (if an architect was never a developer, by the way, then run a mile – I’ve
had the pleasure of working with these unusual breeds of people before and they
are dangerous). Any decent architect would know that the quality of the code
that will support the solution is inexorably tied to the Development Leads
overseeing the work and the CM
Resource (or Team) managing the process of its production, compilation and
deployment – not to mention quality assurance which is a shared responsibility.
As such architects should always be seeking to understand how the Development
Team leads are managing, testing and deploying the code to develop the solution
they spent weeks/months of long hours finessing to get right. If you’re not
doing this then you need to question how seriously you’re taking the work you are
doing. But if the fault cannot be traced back to the architect then where does
it lie? Experience suggests that it is usually a result of budgeting and
estimation not catering for it. Typically this occurs because the Project
Managers or Business Owners (who often aren’t technical and hence don’t
understand or appreciate the critical value of it) will push back on getting
permanent CM resources. Other times it is the Dev Leads who, under pressure to
cut down their estimates, will downgrade it to a small task that a developer
can manage: “just create the branches, get a build running and that’s all we need”
has been a common saying, which is really another way of saying “we’ll worry about
it later, let’s win/get the work first”.
To put it bluntly, if you don’t do
it on any project where more than 5 developers are working (and extrapolate
that out to the number of testers, analysts etc.), then you are
going to pay a heavy price for it when the project grows in size.
And now for a real-world example
to illustrate the points made so far:
A few years ago a company I was working for
was tasked to build a very large IT system. It had numerous function points,
integration, workflow, and batch processing requirements. The user interfaces
were highly complex and the design work required a large amount of modelling.
In summary it was not your average 6-month project: it was estimated to take
several years to complete, and the code was to be developed offshore involving
a team numbering nearly 70 developers at its peak. The offshore team were told
to get a CM resource on board and, not finding one readily available, hired a
senior technical lead to do the job. After a few months the Tech Lead, who had
not sat and planned the CM work effectively and thus had not mandated the
standards around branching, merging, versioning, builds and testing (as he did
not understand CM), became buried in a huge spaghetti mess of build and
deployment scripts, numerous branches and an incoherent folder structure.
Lacking adequate and enforced standards, the developers began running rampant:
adding compiled assemblies into the source code system, creating numerous
folders and sub-branches, and diluting the quality of code (as there were no QA
checking tools installed). No-one was sure how it was all meant to work, so
builds and environment installs took days to complete. End result: builds were
failing everywhere and the onshore team were raising concerns. Offshore decided
that the problem was resourcing and so another resource was put
onto the team. This time it was a junior developer. The junior developer, not
knowing what she was doing, just made the mess worse, so another junior dev was
brought in and the problem grew worse again. At this point you could assume
that it’s easy to point the finger at the offshore team, but that would not be
fair. Firstly, the onshore team themselves were so overloaded with work that they
did not have the time to monitor the CM process constantly; they just
saw the builds failing. Secondly, the resources tasked to do the CM work
were not experienced; they were just told to get it done. In other words:
Management failure. Eventually onshore resources were sent offshore to get the
project back into shape and then the penny dropped when they arrived and saw
the development process in action. I got brought onto the project, initially to
do an architectural assessment but then I started noticing the CM issues as well, got called into a few
meetings and then it was determined I should try and help sort it out. My first
step was to immediately draft a set of standards that would become the bible
for managing the CM work. This was distributed across all the development teams
and it was made clear that adherence to these standards was now mandatory:
anyone caught breaking the rules would have their work removed, and repeat
offenders could be removed from the project themselves. I then halted all code
check-in work, cleaned the source code repository (freeing up gigabytes of
disk space) and set about wholesale restructuring of the builds and scripts. Eight
months later and a trip overseas to help the team institute the standards and
we had a functioning CM practice. It
would be fair to say that CM was estimated for a lot better in future projects
after this.
So resourcing for CM is harder
than some people realise
Configuration Management is not easy;
finding good resources therefore can be very hard and companies that have them,
particularly large organisations that do a lot of development work, hang onto
them tightly. An effective CM person needs to be a strong team leader, possess
good analytical skills, have an appreciation for process and be able to
document and articulate (with crystal clarity) the expectations and
standards that need to be met to ensure that the software is stored, built and
tested in a consistent and efficient manner. And yes they also need to be
feared by developers. I’m not a believer in using fear as an educator but in
the world of IT it does tend to be rather effective – at least in this
discipline. CM people need to know
architecture: not so that they can recite TOGAF or Zachman to the nth degree, grow
beards and wear bow-ties to work, but so that they understand how a solution
must hang together from a technical viewpoint. This will ensure that the
scripts to configure the OS, Databases, Web Servers and so on align with the
architecture that has been designed to ensure the solution can be deployed
properly. They also need to know technology, particularly the languages used by
the development teams, to ensure that the builds, unit tests, reports, code
management tools and all other manner of CM artefacts are being put together in
the best possible manner to ensure they function and integrate correctly.
Finally CM people must evangelise
process automation. This does not just extend to Continuously Integrated builds
and integration of reports either. Source Code repositories and build and
deployment scripts should be constantly checking that code being
checked in is not violating quality rules and allowing bad code to pollute the
body of work. There is no excuse not to do this nowadays. Source Code
Management systems are now so sophisticated they can manage an entire project
from the design, through to the development, unit testing, system testing,
system integration testing, user acceptance testing and deployments. There is
also a plethora of third-party and free tools available online (big thumbs up
here for CruiseControl and NDepend) that bolt onto these products and can perform
all manner of code quality checking, both pre- and post-compilation. The aim
should always be to use these as much as possible to create a watertight
Source Code Management system, with detailed reporting, to ensure that quality
is constantly being enforced and monitored with minimal human intervention.
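
To make the idea of an automated check-in rule a bit more concrete, here is a minimal sketch in Python of the kind of gate that would have stopped compiled assemblies getting into the repository in the story above. Everything in it is an assumption for illustration only: the list of blocked extensions, the function names and the way the list of files arrives are all made up, and a real setup would wire something like this into whatever hook or policy mechanism your source control system actually provides.

```python
from pathlib import PurePosixPath

# Hypothetical rule set: extensions treated as compiled output rather than source.
BLOCKED_EXTENSIONS = {".dll", ".exe", ".pdb", ".obj"}


def find_violations(paths):
    """Return the paths whose extension marks them as compiled output."""
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower() in BLOCKED_EXTENSIONS
    ]


def check_commit(paths):
    """Return (accepted, message) for a proposed check-in.

    `paths` is assumed to be the list of files in the check-in, however
    your source control system exposes it.
    """
    violations = find_violations(paths)
    if violations:
        return False, "Rejected: compiled artefacts in check-in: " + ", ".join(violations)
    return True, "Accepted"
```

Calling `check_commit(["src/App.cs", "bin/App.dll"])` would reject the check-in and name `bin/App.dll` as the offender, while a check-in containing only source files is accepted. The point is not this particular rule but that the rule is enforced by the system on every check-in rather than by a person remembering to look.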