Are Relationship Managers the Anti-Pattern of Outsourcing?

2014.01.19

OK, that’s a bit unfair, but the question is: are there the correct number of them, and are they in the most effective relationships within the organisation’s social network? Having recently been seeking a new role I’ve been asked “tell me about a time there was conflict” a few times, and many of the examples that come to mind have involved an outsourced arrangement or two. This made me wonder what is fundamentally going wrong and whether Social Network Analysis can help. My first thought is that perhaps the organisations were not well understood in the first place, so that when the outsourcing structure was designed the wrong number of relationship managers was put in the wrong places. An analysis of the organisation before outsourcing will reveal the structure within, and most importantly between, the proposed ‘retained’ and ‘outsourced’ groups.
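As an illustration, a pre-outsourcing analysis can be as simple as counting how many working relationships cross the proposed retained/outsourced boundary. The sketch below is a minimal example, assuming the organisation’s relationships have been captured as an edge list (from a survey, say); all names and the group split are hypothetical:

```python
from collections import Counter

# Hypothetical "who works with whom" edges gathered before outsourcing
edges = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "dave"),
    ("carol", "dave"), ("dave", "erin"), ("erin", "frank"),
]

# Proposed split into retained and outsourced groups (hypothetical)
retained = {"alice", "bob", "carol"}
outsourced = {"dave", "erin", "frank"}

def boundary_spanners(edges, retained, outsourced):
    """Count, per person, how many of their relationships cross the
    retained/outsourced boundary. People with high counts mark the
    natural places to position relationship managers."""
    counts = Counter()
    for a, b in edges:
        if ({a, b} & retained) and ({a, b} & outsourced):
            counts[a] += 1
            counts[b] += 1
    return counts

spanners = boundary_spanners(edges, retained, outsourced)
print(spanners.most_common())  # dave carries most of the cross-boundary load
```

In a real engagement the edge list would come from the social network analysis itself; the point of the sketch is that the boundary-crossing relationships, not the org chart, should drive where relationship managers sit.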


Take a look at my earlier blog post for more.


Migrations: Practice Makes Perfect

2014.01.19

Some years ago I was a technical lead for a large migration project (around one billion pieces of data). I’ve previously described the transformation structure and would like to share some further advice: practice, practice, practice! If, like my migration projects, yours has a lot of complexities, such as in-flight direct debits and batch timing issues, experience has shown that practice really pays off. What do I mean by practice? Once development of the transformations and reconciliations is complete, the whole migration should be run against an accurate target (copies of live systems) and to the intended timing of the live migration (weekends, evenings, whatever your choice). Make it as close as you can to the live environment (without actually issuing live transactions, of course… another topic) and I can more or less guarantee you will find some issues. But that’s the point: you don’t want any on the actual live run. How many practices will it take? I would suggest two to three, but if you are migrating in stages you’ll get better at the practicing (meta-practice?) so maybe just one will be enough. Good luck with your migration, and simplify that landscape!
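To make the reconciliation idea concrete, here is a minimal sketch of the kind of check a practice run exercises: fingerprint each record on both sides and report keys that are missing, extra or changed. The field names and hashing approach are illustrative assumptions, not the tooling from my projects:

```python
import hashlib

def fingerprint(record):
    """Stable hash of a record's business fields, independent of key order."""
    canonical = "|".join(f"{k}={record[k]}" for k in sorted(record))
    return hashlib.sha256(canonical.encode()).hexdigest()

def reconcile(source, target):
    """Compare keyed record sets and report missing, extra and changed keys."""
    missing = [k for k in source if k not in target]
    extra = [k for k in target if k not in source]
    changed = [k for k in source
               if k in target and fingerprint(source[k]) != fingerprint(target[k])]
    return {"missing": missing, "extra": extra, "changed": changed}

# Hypothetical source records and their migrated counterparts
source = {
    "ACC1": {"name": "Smith", "balance": "100.00"},
    "ACC2": {"name": "Jones", "balance": "250.50"},
}
target = {
    "ACC1": {"name": "Smith", "balance": "100.00"},
    "ACC2": {"name": "Jones", "balance": "250.05"},  # transformation bug
}

report = reconcile(source, target)
print(report)  # {'missing': [], 'extra': [], 'changed': ['ACC2']}
```

Finding that transposed balance on a practice run, rather than the live run, is exactly what the rehearsals are for.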

Why clean data is so important


Often data is collected as part of a process but is not essential to completing that process. For such data items, unless quality is enforced in some way, it is very likely accuracy will fall below 100%. A while ago I wrote about predicting sales from quotations, but what I did not explain was that the data came from two systems, one of which did not enforce adequate validation of a user input field required to join that system’s data to the other system’s data. Presumably this did not matter to the business process or the analysis required when the system was first put in. Much useful analysis can be performed on data that is not 100% accurate, for example looking at ratios over time, however there is always going to be some doubt about the results. My examination of machine learning techniques was possible only after I had ‘cleaned’ the data by removing any records that could not be matched between the two systems. My results showed only a small improvement in being able to predict a sale over that of tossing a coin, which did not seem particularly useful in the context being examined. However, in some applications a small improvement over a 50/50 guess could be very important (e.g. share trading), and in those cases even slightly inaccurate data could be giving misleading results. Because of the potential uses of data, unforeseen when it was originally captured, I would advise architects to be less tolerant of poor data quality.
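The cleaning step described above amounts to an inner join that also keeps hold of the rejects, so the match rate can be reported alongside the results. A minimal sketch, with hypothetical field names:

```python
# Quotation records from the system with the unvalidated reference field
quotes = [
    {"ref": "Q-001", "amount": 1200},
    {"ref": "q001",  "amount": 800},   # free-text ref, never validated
    {"ref": "Q-003", "amount": 450},
]
# Outcomes keyed by reference in the second system
sales = {"Q-001": "won", "Q-002": "lost", "Q-003": "won"}

def clean_join(quotes, sales):
    """Keep only quote records whose reference matches the sales system,
    returning both the joined rows and the rejects for reporting."""
    matched, rejected = [], []
    for q in quotes:
        if q["ref"] in sales:
            matched.append({**q, "outcome": sales[q["ref"]]})
        else:
            rejected.append(q)
    return matched, rejected

matched, rejected = clean_join(quotes, sales)
print(len(matched), len(rejected))  # 2 1
```

Reporting the reject count matters: a high rejection rate is itself evidence that the remaining ‘clean’ sample may be biased.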

Dilemma of Choice

2013.12.29

One reason I’m a fan of cloud computing, and in particular Platform as a Service (PaaS), is that it restricts the design choices available when creating or integrating solutions. Of course choice is great, but it can become a problem when there are too many options and no clear way to choose between them. How many times have you seen a project stuck in analysis paralysis? Enter the Architect…

Firstly, at the Enterprise level (and enterprise could refer to an entire organisation or an organisational division depending on structure and business model), there need to be some clear guidelines and technology selections. Exactly what these are will depend on what’s important to the organisation. Architecture principles translate what’s important to the organisation into guidance that can be applied to technology selection and solution designs. They will have the effect of restricting the number of choices available. Examples of restrictions include:

  • High-level architectures: Monolithic, Layered, Service Oriented, Event Driven, etc.
  • Data that must be common/shared (master data)
  • Integration patterns (point-to-point, brokered, pub-sub, etc.)
  • Availability targets, disaster recovery
  • Security standards
  • Data protection (e.g. how does the Patriot Act affect the ability to use cloud providers?)
  • Technology vendors
  • Consultancies
  • Individual technology components like OS, Database, Middleware, Web
  • Hosting: internal, virtualised, external
  • Existing solutions/applications that will not be replaced
  • Upgrade frequency (incremental or wait until support is ending)

I’d like to re-emphasise that, at the enterprise level, principles should only reflect the most important guidance and will lead to some restrictions from the list above. There will also be commercial considerations that bring restrictions, most probably due to enterprise-wide deals with vendors like Microsoft, Oracle and IBM to volume-licence a range of products.

If the Enterprise Architecture job has been done well a reasonable number of choices will already have been made but there will usually be some still to make, for example:

  • Do we make use of new features in the framework, database, server, etc.?
  • Which library do we use for feature x? This is more of an issue for open source, where multiple implementations are available, than for frameworks like .NET
  • We could do this in compliance with the Enterprise Architecture but don’t have time and there is a non-compliant alternative
  • This is a completely new requirement, where do we start?

I would suggest that if it’s taking a long time to choose between alternatives it is either because:

  • The Enterprise-level guidance is not clear in this respect, in which case the Enterprise Architect needs to be involved in order to clarify and update the existing guidance; or
  • There isn’t a great deal to choose between the alternatives and there is deliberately no directive from the Enterprise level in this area (the project team is free to choose). My suggestion is to pick one or two, and try to develop a part of the most complex or disputed functionality (a spike test). The spike test needs to be quick, no more than a day or two: if it works stick with it and resist trying every alternative. In my experience it’s better to move a project forward even if some of the choices are later found to be suboptimal.

Architects Don’t Code?

2013.12.29

A while ago I went to a job interview and was somewhat surprised by the interviewer getting rather over-excited about architects writing code: “I don’t EVER want to see an architect coding” was their view. “Not even to better understand a problem?” I asked. “No, NEVER”. I’m not entirely sure what the issue at that organisation was, but it has made me consider what the difference between the architecture and development roles is.

Starting at the beginning (Frederick P. Brooks Jr.’s Mythical Man Month): the Architect is concerned with the Conceptual Integrity of a system/solution, i.e. that it makes sense overall regardless of how each part is implemented. The architect must also be able to suggest a way of implementing anything they specify but be able to accept any other way that meets the objective (otherwise how do you know if you’re being fleeced?).

When Brooks was writing there were far fewer layers of abstraction in IT. Today there are many more: from Conceptual Designs, Functional Specifications, High-level Languages to Machine Code, Microcode and CPU logic gates. Each of these has their own ‘Architect’ who leaves most of the implementation to the specialists at the next lower abstraction layer until you get to the transistors. For my purpose I’m considering the roles of Enterprise, Solution and Data architects who tend to be found in medium to large organisations.

So what differentiates Enterprise, Solution and Data Architects from Developers? In my opinion it’s ambiguity: the inputs and outputs of the architects are ambiguous. The outputs from Developers (code) are definitely not ambiguous. I have bored many managers by repeatedly explaining that any specification written in prose (English) is going to be ambiguous (so deal with it and stop wasting time); if it were not ambiguous it could be compiled. I’ve not seen any commercially available compilers that take in a Word document and output a fully functional system. The input to a developer could be unambiguous if an awful lot of time has been put into the specification, but generally some degree of ambiguity remains.

In my view what makes an Architect is the ability to deal with a great deal of ambiguity, both in their inputs and their outputs (not knowing exactly how a feature is to be implemented). Developers also have to deal with ambiguity, but at least one side of their work (the code) is unambiguous. Should Architects code? Yes, but not production code: developers are the coding experts.

For some other views…

Unstructured data: an architect’s guide to text

2013.12.29

I recently completed an excellent book which examines how to deal with information presented as text: Taming Text, from Manning. The authors do a good job of introducing each topic and explaining how a number of open source tools can be applied to the problems each topic presents. I’ve not studied the tools in depth, but the topic introductions are excellent.


I have summarised each topic, below:

  1. It’s hard to get an algorithm to understand text in the way humans can. Language is complex and an area of much academic study. Text is everywhere and contains plenty of potentially useful information.
  2. The first step in dealing with text is to break it down into parts. The most simplistic aim of this step is to extract individual words, though there are a number of approaches and more sophisticated ones will need to handle punctuation. The process of splitting text down is called tokenisation. Individual words may then be put through a stemming algorithm in order to equate pluralised forms and different tenses of the same stem. A stem might be a recognisable word, but not necessarily.
  3. In order to search content it must first be indexed, which will require tokenisation and stemming, and maybe also stop-word removal and synonym expansion. It is also useful, for subsequent ranking, to use an index that allows the distance between the words found from the search phrase to be calculated for each document searched. There are a number of algorithmic approaches for ranking results, the simplest of which are based on the vector space model. Ranking is an evolving area and the big internet search engines are constantly refining it. Another refinement that can be applied to search is the key constituent of spell-checking: fuzzy matching.
  4. Fuzzy matching is another area of academic research, with some established algorithms based on character overlap, edit distance and n-gram edit distance, which may all be combined with prefix matching using a trie (prefix tree). The most important aspect of fuzzy matching to understand is that different algorithms will be more or less effective depending on the sort of information being matched; for example, movie titles are best matched using Jaro-Winkler distance, but movie actors are best matched with a more exact algorithm, given that they are used like brand names.
  5. It can be useful to be able to extract people, places and things (including monetary amounts, dates, etc.). Again there are a number of algorithms for achieving this, including open source implementations from the OpenNLP project. Machine learning can play a part provided there are plenty of tagged (training) examples available, which is especially useful where domain-specific text needs to be ‘understood’.
  6. Given a large set of documents there is often a requirement to group similar documents. This process is called clustering and can be observed in operation on news aggregation sites. Note that clustering does not assign meaning to each cluster. There are a number of established algorithms, many of which are shared with other clustering problems. Given the large volumes and algorithm complexity, a real-world clustering task is quite likely to make use of parallel processing, and this is what the Carrot and Apache Mahout projects provide by building on top of Apache Hadoop.
  7. Another activity for sets of documents is classification, which is similar to clustering but starts with sets of documents that have been assigned to a pre-determined category by a human or other mechanism, for example by asking users to tag articles. Example classification tasks include sentiment analysis and rating reviews as positive or negative. Of course there are a number of algorithms and implementations to choose from, each with trade-offs in accuracy and performance.
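Points 2 and 3 above can be sketched end to end in a few lines: tokenise, apply a (deliberately naive) stemmer, then rank documents against a query using the vector space model with cosine similarity. Real systems would use a proper stemmer (e.g. Porter) and TF-IDF weighting; everything below is illustrative only:

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens, discarding punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def stem(word):
    # Toy suffix stripping; a real system would use a Porter/Snowball stemmer
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def vectorize(text):
    """Term-frequency vector over stemmed tokens."""
    return Counter(stem(t) for t in tokenize(text))

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

docs = [
    "Migrating data between systems",
    "Cooking recipes for weeknights",
    "A guide to data migration practice runs",
]
query = vectorize("data migration")
ranked = sorted(docs, key=lambda d: cosine(query, vectorize(d)), reverse=True)
print(ranked[0])
```

Even this toy shows why stemming matters: the naive stemmer fails to equate “migrating” with “migration”, which is exactly the class of problem the proper algorithms in the book address.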


Successful Outsourcing

2013.11.30

Many architects will be familiar with the failings of outsourcing and some will be familiar with a few successes. I have been able to observe two very similar organisations outsourcing to the same provider. One of these I would regard as a successful outsourcing and the other, I would say, is not. The two organisations, and the areas of business outsourced, are illustrated below:


To explain the diagram a little: ‘Product Administration’ in this instance refers to a financial services product like a pension or ISA; ‘Investment Administration’ deals with the underlying investments like mutual funds or equities.

So which is the successful outsourcing? Well, it’s organisation B: the outsourced function is well defined by a number of business-level messages and there are a number of external organisations that could run it. All great, but the key advantage over A is that B is in full control of the customer experience. I would not consider A’s outsourcing to be successful because of the loss of control over the interaction with the customer. To change the customer experience A needed to go back to the outsourcer but, no surprise here, the outsourcer was busy chasing new business and A was somewhat stuck.

In their book Enterprise Architecture as Strategy, Ross, Weill and Robertson refer to A’s outsourcing model as a “Strategic Partnership”, with a success rate of 50%, and B’s outsourcing model as a “Transaction”, with a success rate of 90%. That is not to say one is better than the other, but that the former carries more risk, which, given the prize, might be worth taking. Architects need to recognise the risks in order to assign resources and mitigate as many as possible.


Enterprise Architecture Smells

2013.11.10

I recently had a conversation about Enterprise Architecture which went something along the lines of “how would you approach EA if you came into a new organisation and there is nothing: no EA, no IT strategy, no documentation or other guidance?”. Not having personally experienced this, or thought about the scenario, I was slightly stuck and replied along the lines of: (1) understand the business strategy and operating model, (2) produce high-level documentation, an assessment of the current situation and an identification of gaps, (3) look to resolve the gaps, firstly by shaping any existing IT programmes… Back came the reply, “I disagree…”, followed by a logical and sensible explanation of how that individual had begun to bring in EA. In that explanation there was a great deal of context, which made me realise that in all the EA work I have ever approached there always has been context, and it is this that will guide you in where to start. I think the contexts where EA work is requested, or identified as needed, are well summarised in the book Enterprise Architecture As Strategy [Ross, Weill & Robertson]; they call them Symptoms, but I like to think of them as bad Smells:

One Customer Question Elicits Different Answers: Most probably data duplication. Start with some high-level data modelling; a quick win, to build credibility, is to eliminate just one piece of data duplication; for the longer term, identify all the core data that should be shared in order to drive programmes of work to rationalise the estate and, of course, ensure good ongoing governance of data.

New Regulations Require Major Effort: In my experience the root cause is Different Business Processes and Systems Complete the Same Activity (see below).

IT Is Consistently a Bottleneck: I think this is tricky but my first suspicion is there is too much re-inventing (of systems and methodologies) going on and I would agree with Ross et al that the long-term approach is to introduce standardisation. Standardisation can be applied across methodologies, technologies and ultimately in the creation of generic solutions, which can be quickly re-used. It’s difficult to pick out a specific quick win but I would look to find something that can be re-used to get work underway faster than previously experienced. Adoption of SaaS and/or PaaS could be a fast-track mechanism for standardisation but there are many pros and cons to consider.

Different Business Processes and Systems Complete the Same Activity: I would start with a business capability model to understand the extent of the problem. A quick win here is not necessarily easy: it’s simple to say eliminate a duplicate system but hard to do. Longer term, develop a plan that moves the capabilities delivered by IT systems to more closely match the business capability model, such that each capability is implemented in, ideally, only one system, but practically as few as possible. Of course, governance needs to review proposals against the existing map of systems versus business capabilities delivered, to stop problems compounding or recurring.

Information for Making Decisions Is Not Available: more precisely, it is not available at the right time. My starting point here is to examine the flows of data: is it being held up somewhere, for example in overnight or weekly batches, or in waiting for external data? A quick win should be straightforward using improved technical solutions in a targeted area and, longer term, adopting those improved solutions across the enterprise.

Employees Move Data from One System to Another: aka Swivel-Chair Integration. Again this is about data flows, but this time it’s the lack of automation. The approach is similar: first these manual flows need to be understood, and then technology solutions implemented where cost-effective. I know it’s easy to say but often hard to achieve when data sits in silos: the organisation will need to change its operating model to mature its IT systems architecture.

Senior Managers Dread Discussing IT Agenda Items and Management Doesn’t Know Whether It Gets Good Value from IT: I’ll admit I’m not sure where to start on these two. I suspect their cause is one or more of the previous bad smells; anyone care to enlighten me?

The above responses are only places where you could start; EA should extend to all the disciplines mentioned, on a prioritised basis: there will probably be more than one smell, but which is worst?

Maybe very forward-thinking organisations recognise they need EA before they smell something bad, and perhaps I’ll be lucky enough to work with one sometime!

Observations from Big Data Analytics #wmbda

2013.01.01

Another interesting Big Data Analytics event from Whitehall Media. Lots of detailed information, but a couple of overall observations:

Whilst it has been going on for a while, it is significant how many companies formerly associated mostly with hardware are moving into software and consulting: both Dell and HP talked about their big data achievements with almost no mention of the hardware. I think this is further evidence of how commoditised hardware and system software are becoming, probably driven by cloud offerings. The real value to be gained from IT is shifting from being able to master the Technology to fully exploiting the Information.

In listening to talks about how to make big data succeed, a couple of words jumped out at me: curiosity and discovery. A number of speakers talked about the need to find the right people from within your organisation who can form a team to drive the initiative, and that no single person is going to be the key. They will need a combination of technical skill, domain knowledge and vision, and must be allowed to explore without fear of being labelled a failure because, even if analysis of some data does not give you actionable insights, you now know more about that data.


Architecture Archaeology


Often the Architect will come across an aspect of the organisation they have not needed to tackle in the past. This could be for a number of reasons: maybe they are new to the organisation, maybe it was part of an acquisition, or maybe they have simply never had time to become familiar with this part of the organisation and it has not featured in any recent projects. The architect now needs to perform some architecture archaeology. My advice is to look to the following sources:

  • Existing documentation: this comes with a big health warning – it is probably wrong. Not because it was wrong when written but there have probably been changes in the meantime. If, indeed, it exists at all. If it does exist that’s a great start but it may not exist in an electronic format or in your modelling tool of choice. My advice would be to bring it into your tool of choice, copying by hand if necessary, as you are probably going to need to modify it.
  • Database: if the system(s) under investigation has a database, go there next. The database is probably your best bet for understanding the data entities involved and the relationships between them. Some databases will include referential integrity constraints, which are the most reliable way to understand relationships, but if there are none you’ll need to make some deductions based on field names. Indexes may also be a clue, as they are often created in order to tune performance for joins. Stored procedures could be useful too; I’ll discuss those under ‘Code’.
  • Users: will be able to demonstrate how they use the system(s) revealing its functional purpose, the workflows involved and participating data entities. But be warned, each individual user may not exercise all the functionality in the system so take a look for yourself at menus to see if all options and sub-options have been exercised. Even then there is no guarantee everything is covered as some systems reveal functionality dynamically based on user permissions and/or the content of records being processed.
  • Developers: if the original developers are still around you’re in luck, as you should be able to work with them to get a pretty good understanding of the system(s). My experience is that they don’t remember to tell you everything, so it’s worth comparing this with at least one other source, especially if the original developers are all gone and it’s being maintained by others.
  • Code: code is the ultimate authority in describing a system. I’ve bored many colleagues by repeatedly explaining that if their specification were unambiguous it could be compiled. Code is the only unambiguous explanation of what a piece of software does. Now, there are many coding styles and some are easier to reverse-engineer into an architectural artefact than others. If you are an Architect who does not have a development background it could be quite a tough job, so get some help. Among the easiest styles to deal with is one that pushes virtually all the business-rules logic into stored procedures, which makes it quite a simple exercise of looking through each stored procedure. The worst are the balls of mud, which will contain multiple styles, middleware and/or languages; don’t underestimate how big a job decoding these will be. There are tools which will help model and document the code, but these will give structure rather than function, and their usefulness varies greatly depending on the structure (or lack of it).
  • Logs/Monitoring: these can be really useful for understanding the dependencies between this system and others. For example, a Service Oriented Architecture relying on Web Services can easily be mapped by examining the web server log files.
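The database step above can be made concrete. Where referential integrity is declared, the relationships can be read straight out of the catalogue; this sketch uses SQLite’s PRAGMA so it is self-contained, but the same idea applies to information_schema or the system catalog on other databases. The schema is hypothetical:

```python
import sqlite3

# A toy schema standing in for the system under investigation
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
""")

def relationships(conn):
    """Return (child_table, child_column, parent_table) triples for every
    declared foreign key. Where no constraints are declared, you fall back
    to guessing from field names, as described above."""
    rels = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    for table in tables:
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            # row layout: id, seq, parent_table, from_col, to_col, ...
            rels.append((table, fk[3], fk[2]))
    return rels

print(relationships(conn))  # [('account', 'customer_id', 'customer')]
```

The output is a ready-made edge list for an entity-relationship sketch, which is usually the first artefact worth drawing during the dig.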

By combining the results from all of the above it is possible to get a complete picture; however, you might only need to understand one aspect, like data, so pick the sources that make sense.