Tuesday, 6 February 2018

We should design A&E statistics to help improve the system not to make it look better than it really is

The desire to make A&E performance look good is in conflict with the need to improve it. The idea that we should report system performance (as David Nicholson argues) is a mistake. We should understand how the system actually works and measure the numbers that most help focus improvement effort.

The problem with the statistics on A&E performance goes far deeper than worries about fiddling the data to get better reported numbers. The real problem is the conflict between the political need to report the best performance and the management need to know what is happening so they can improve performance. The two are in serious conflict in a way the system has never fully acknowledged.

The political desire is report something, preferably something good. The management need is to report something useful for directing management attention to actions that will lead to improvement.

The conflict arises because essentially all the performance problems are in major A&E departments so reporting a combined number (including minor units) distracts attention from the units where performance is worst. For management purposes we might as well ignore all the stand-alone WICs and MIUs as they almost never see waits longer than 4hr. We should report their performance, but we should report it separately to avoid confusion about where the problem is.

The political argument is that emergency care is a system and we should report the overall performance of the system. But this is based on a fallacy. Originally, MIUs and WICs were intended to relieve "pressure" on major A&Es who would, as a result, show better performance. This failed for two reasons: one was that there was no evidence that they diverted a notable number of patients away from major A&Es; the other insight (only available after better data was collected from everyone) is that, even if they had succeeded in diverting patients from major departments, this would not have improved performance in the major A&Es (the divert-able patients are not the ones with long waits even in major A&Es).

What this means is that there are really two systems out there not one: major A&Es (type 1 departments) and units handling minor conditions (type 3s). They don't interact enough to be considered a single system. So they shouldn't be measured as a single system because that just creates confusion about where the problems are.

(There is a slight grey area where a type 3 unit is closely associated with a major A&E. In many of these all the minors are diverted at the front door to the type 3 unit. But this is exactly how a well functioning type 1 A&E should behave internally anyway (minors should be streamed to a simple and fast process) so it seems a little odd to separate the two for reporting purposes. This only applies when the units are co-located.)

The big conflict comes when hospitals are allowed to count remote WICs and MIUs in their headline performance. The only benefit of doing this is political: it makes the overall number look better. It causes immense confusion and inconsistency in headline performance, not least because some hospitals can do it and others can't making any performance comparisons unfair. Worse, some hospitals have wasted large amounts of management time trying to grab control of minor units so their headline numbers improve, a change which makes precisely no difference to any patient's experience but wastes a lot of management resource in pursuit of window-dressing not actual improvement.

It is unclear what the recent changes mean to all this. In the current reports it is normally possible to extract pure type 1 performance if you want to see it (though it isn't the headline number reported). If the new rules allow less transparent reporting of numbers from minor units they are clearly obfuscatory and bad for the system.

David Nicholson (the former head honcho of NHSE) doesn't agree. He claimed in this tweet that the 95% target was a system target, was was correctly set at 95% and would have been set much lower if it was intended to cover type 1 performance only. But he needs a history lesson. The original target was set at 98% and was only reduced to 95% when the incoming coalition government realised that abolishing the target completely would be a catastrophe. And there was no convincing clinical argument for the change (the original 98% was to allow for a small number of clinical exceptions–concussions, for example–where patients needed to be observed for more than 4hr before discharge). Yes, some medics thought the target was too strict, but they had no evidence-based clinical reasons for this: they believed, naively, that a more relaxed target would be "easier" to meet. They were wrong: before the target was changed the system was exceeding it; as soon as it was changed performance declined rapidly and frequently failed even to meet the new–supposedly easier–target.

It is also worth noting how the system performed when the original target was set. I extracted the chart below from my archives (I provided the internal weekly reports of national performance to the A&E team for several years after the target was first set: the chart comes from a public presentation I did; sadly I don't have access to the detailed data any more).

The system as a whole eventually met the 98% target for several years. And the type 1 performance often met the 98% until 2010. Moreover, the dilution effect of including the non-type 1 units was small when the type 1 performance was good. It is also worth noting that the decision to publish the system performance (including non-major A&Es) was challenged by analysts who knew that the focus needed to be on major A&E departments. The politically-minded civil servants disagreed and decided to publish the system performance but compromised by including the type 1 performance separately in internal reports to support more focussed improvement interventions.

While consistency is a good thing to aim for (this is the excuse for the changing rules) we seem to be aiming for a fake consistency. WICs and MIUS are not evenly distributed so consistent aggregation of numbers creates an inconsistent and unfair view of actual major A&E performance in different hospitals. By far the easier way to achieve real consistency would be to focus on major A&Es only and ignore the rest (or report their performance separately). That would be consistent and fair. But performance would look much worse.

The same is true for other, related, statistics on A&E performance. England also publishes a 12hr wait statistic, but this is incredibly misleading because it isn't an end-to-end metric. The clock starts when the hospital makes a decision to admit the patient, which many don't make until they know a bed is free meaning that the patient may already have waited 12hr before the clock even starts. Wales and Scotland start the clock when the patient arrives, which isn't gameable. This has confused the Prime Minister who incorrectly compared Welsh and English numbers during PMQs, drawing criticism from the chair of the UK Statistics authority, not least because there are about 100 times more end-to-end 12hr waits in England than the published number quoted by the PM suggested (as I explained here).

Real 12hr waits are actually measured in England just not published. NHS Digital now release them (but infrequently) and few NHS managers seem to know they exist or use them despite the important additional information they provide supplementing the reported 4hr performance. They are very embarrassing as they highlight just how badly A&E performance has deteriorated in the last 8 years. So embarrassing that I was once fired for impetuously talking about them in public (I shouldn't have done but I was naively trying to improve the quality of the debate about A&E performance using actual data).

To summarise. The real problem isn't fiddling the numbers: it is choosing the wrong numbers to look at in the first place. The need to improve performance is in conflict with the political need to report the best performance possible. We should report the numbers most useful for driving improvement (type 1 4hr and 12hr performance) not the ones that confuse us about where the problems are.

PS when private sector firms do this with their performance numbers it usually ends up in disaster. Enron and Carillion, for example, used clever ruses to make their public headline financials look better than their real underlying performance. In the end their managements fooled themselves. Eventually, the real situation became apparent, but far too late for anyone act to avert disaster.

Thursday, 25 January 2018

The week in bullshit, continued…

Last week I fired off a rant about how politicians misuse statistics to win arguments with little regard for the relevance or context of the numbers thereby doing extreme violence to honesty and truth telling. This week I find yet more examples from the NHS.

Here is why the specific instances constitute bullshit.

The Prime Minister said the following in Prime Minister's Questions on January 24 in response to a Jeremy Corbyn complaint about poor performance in the English NHS over winter:

If he wants to talk about figures and about targets being missed, yes, the latest figures show that, in England, 497 people were waiting more than 12 hours, but the latest figures also show that, under the Labour Government in Wales, 3,741 people were waiting more than 12 hours.

Corbyn didn't spot the problem (I presume neither he nor his advisors are any more knowledgeable about these statistics than the PM or her health minister).

But it is quite a simple issue: the Welsh NHS counts something different when it measures 12hr A&E waits than the English NHS. In Wales, the clock starts when the patient arrives; in England the clock starts when a decision is made to admit the patient to a bed. This decision is highly gameable and is highly gamed. Even without any explicit fiddling of the numbers (there are rumours that some management teams stop their teams recording a time in a timely way to reduce their reported numbers) the decision is often postponed until the hospital knows there is a bed available. This may happen after the patient has already waited 12 hours in A&E.

The English "trolley wait" metric is a terrible, useless and misleading metric. It actively distracts from a good understanding of the problem of long A&E waits. Yet here we have a politician using it to win an argument with the opposition instead of trying to understand what is going on in A&E.

Here is some help to put it in context. The comparable number is accessible from data collected from hospitals (it is trivial to calculate true 12hr waits from A&E HES data, it just isn't routinely done and won't yet be available for this winter as national HES takes a few months to compile). In january 2017 there were 46,413 true 12hr waits in English A&Es (these figures were release by NHS Digital after an FOI request). That is the comparable number May should have quoted (if she had an up to date version of it which we have no reason at all to assume would be better in January 2018). If anyone in the system cared to have reliable and useful numbers to tell them how A&E was performing, they could easily collect these numbers on the same basis as Wales enabling them to have a much better and ungameable insight into what is really happening. Guess why they don't do that.

The disease, unfortunately, runs deep. Here are some extracts from Pauline Phillip's report on winter pressures to the NHS Improvement board on january 24: information for January suggests an improvement and the system is performing better than at the same point last year… 
[compare to this a few paragraphs later]
...A&E performance for December was 85.1%. This is 3.8ppt below the previous month (88.9%) and 1.0ppt lower than the same time last year...

...Performance is impacted by higher bed occupancy than last year and increases in attendances and emergency admissions...

[a few paragraphs later]
...Type 1 attendance growth compared to the same month last year is 1.0%.

[a 1% increase in attendance is actually below the long term trend in increases, though, to be fair, admissions were up by a much larger amount and they matter more. OTOH the excuse that "performance is impacted by...increases in attendances" is not the most accurate way to report the situation]

...the trend for much lower long trolley waits continues; 12 hour waits are 10% lower compared to the same time last year.

The problem here is that the text is frequently misleading when compared to the numbers quoted (always a danger when people are allowed to write paragraphs of bullshit instead of showing the clearest analysis of the key data points in context). And Philip, who should know better having run one of the hospitals with an outstanding A&E department) goes on to use the same trolley wait statistic that the PM quoted in her answer to make a claim that things are improving. It is such an unreliable statistic it tells us no such thing.

If you are going to manage A&E better you need to use numbers that are reliable indicators of what is really going on. Not terribly misleading numbers that are both a poor, misleading metric and utterly gameable. The trolley wait metric should have been burned years ago.

Maybe this point is simply not understood and the politicians and NHS leaders just don't get that this statistic is bullshit. Maybe this interpretation lowers their culpability for promulgating bullshit, but it is hardly comforting that the people in charge of improvement don't seem to possess the basic knowledge that any competent analyst of A&E statistics has known for years.

The reason why England doesn't routinely release reliable numbers about long waits in A&E is that they are very embarrassing. If they were widely used by NHS Improvement, as they should be, to understand what was really happening so their efforts could be focussed on generating real improvements, there would be a lot of bad headlines (which might be worth it if it helped but led to actual improvement). 

Sadly, in politics and organisations dominated by political management, improvement isn't the point: good headlines are all that matters. The impact of political bullshit is pervasive and corrosive.

PS. I'm not the only one who noticed. Faye Kirkland posted this on twitter just after I completed the original version of the blog. It is a letter to the Chair of the UK Statistics Authority pointing out just how misleading her comparison was. It will be interesting to see how he reacts.

Tuesday, 16 January 2018

Political bullshit with numbers is making it ever harder to make good decisions

If governments want to make good decisions they have to have reliable data about what is happening. But they increasingly don't use numbers that way. Instead of using data for insight they use it for bullshit and undermine the evidence they need to make a difference to anything.

So the NHS is having a winter crisis. This year instead of the service responding in a panic when the unpredictable event of winter occured, the panic response was, apparently, planned. Apparently, this is good, our lords and masters said so.

But there is a little vignette that occured in Parliament that illustrates a great deal about why we have such problems and even more about the reasons why we currently look like a kakistocracy. It relates to the statistics about bed occupancy in hospitals and illustrates something profoundly disturbing about how politicians handle statistics and use numbers.

The background to the story is that the government has now mandated daily statistics about "winter pressures" in the NHS. That might not be a bad thing in itself if the point were to make management decisions in response to the numbers (though this supposes an ability to know what response to make and to interpret the numbers correctly: neither are obviously true).

One of those statistics is bed occupancy. This isn't a very useful statistic (as I've argued before) but collecting it daily is much better than weekly or monthly which is what is done for the rest of the year.

The government (and many others) have set a "target" level of occupancy for beds to ensure there are enough free beds each day to cope with demand. That target says no more than 85% of beds should be occupied.

So far so good. But the annoying doctors and opposition insist we are in the middle of a crisis in bed availability and keep complaining. In response to one of those complaints and in explaining the impact of his winter plan Jeremy Hunt said this in the House of Commons:
The shadow Health Secretary told The Independent: “It is completely unacceptable that the 85% bed occupancy target…has been missed”. What was bed occupancy on Christmas eve? It was 84.2%, so this had a real impact.

To put his claim in context, here is the chart of daily occupancy to early january (from the latest data I could get from NHS England). Shading identifies complete weeks:

Which number did Jeremy Hunt repeat? The least representative number on the chart and the only day in the whole of winter where the target was met. He also ignored the longer term context. The days near christmas have the lowest occupancy of the whole year and, historically, have often been in the 60% range.

Now maybe he was just having a bad day and didn't mean to quote something so irrelevant to the current problems with beds. But another minister said this two days earlier when challenged with a similar complaint in the Lords:
The noble Baroness talked about bed occupancy. Of course, we know that high levels of bed occupancy are a concern. Bed occupancy was below the target of 85% going into this period—on Christmas Eve it was 84.2%
I think we can conclude that this number has been shared around the government as the one to quote to deflect any complaints about the state of the NHS at winter.

Sure politicians have to win debates and this will, inevitably, involve some spin. But the way this number was brought up goes beyond reasonable spin and becomes what Frankfurt would describe as bullshit:
[the bullshitter] does not reject the authority of the truth, as the liar does, and oppose himself to it. He pays no attention to it at all. By virtue of this, bullshit is a greater enemy of the truth than lies are.
What this case illustrates is a deeply troubling view about how politicians treat statistics. They do not look to them as a source of useful information they can use to make decisions. They trawl them for any number that supports the argument they are trying to make, regardless of meaning or context. In doing this they utterly devalue their use in decision making or management.

There is an alternative explanation that is slightly less pejorative: perhaps they are so statistically illiterate they don't understand the numbers or the context. Unfortunately it isn't obvious that this explanation bodes any more optimism about the how well the country is run.

I tell this story as an illustration of a very widespread and pervasive phenomenon in modern politics. It isn't just the government, they are all at it, opposition and minor parties included. There seems to be no drive to take the effort to analyse problems before deciding key policies or actions. The process now seems to be to identify some actions or policies likely to play well in newspaper headlines or with supporters. Only then, after the key decisions are made, does anyone look at the evidence and then only to wrench some number, no matter how out of context or irrelevant, that supports their view. Even when the number is rebutterd by the highest statistical authority in the land, they will often continue to quote it (as Boris has just done with the legendarily bad £350m/week goes to the EU). Truth and context are irrelevant: all that matters is winning the argument.

This is no way to run a government. We need people in government and opposition who are competent, honest and who are prepared to do the hard work of analysis before making arguments or deciding policies. If we don't get them, and get them soon, the bullshit will overwhelm our ability to make any good decisions about anything in public policy.

Saturday, 13 January 2018

The NHS isn't over-managed

The NHS needs more money. But the belief that it needs less management or administration is nonsense. It won't spend any new money well unless it improves its ability to spend that money well. That means it needs more management not less.

The FT has a well-deserved reputation for balanced and factual commentary on the big issues. So I was surprised to see this cliche repeated in an editorial on January 5: "There are too many administrators and not enough front-line medical staff."

Other commentators constantly repeat similar untrue cliches. On Radio 5's "good week bad week" on Sunday I heard someone claim "the NHS has more managers than nurses".

It isn't true. The reality is that the NHS is one of the most undermanaged organisations on the planet. Here are the numbers from the NHS staffing system.

There are ten times more nurses than managers and three times more doctors than managers.

And the number of managers has been falling. The numbers were cut by about 30% by the Lansley reforms because he believed the cliche ("more resources to the front line" which I've argued before is one of the stupidest in the debate on NHS policy). It hasn't obviously worked.

Just for reference here is the relative numbers of different types of staff compared to their levels before the coalition government took over in 2010:

What is notable is the steady and then sharp decline in manager numbers (with a subsequent slow increase as the system realised it had drastically overdone the cuts). Also notable is that consultant numbers are rising a lot but nursing number are steady (which suggests that recent complaint that medical productivity is limited not by doctor numbers but by the lack of nurses and support staff has significant support in the actual data).

When the bill was being debated and proposed a sharp cut in manager numbers, I tried to find some benchmarks for how many managers an organisation like the NHS might need. One crude one would be to compare the NHS against other organisations in the UK. Unfortunately the ONS only counts managers in the economy as whole and not by industry or sector (about 11% of the workforce are managers according to them). So I looked elsewhere (see the original BMJ letter reporting this here and the longer version here).

In the USA charities have to declare how much of their budgets are spent on three separate categories: money spent fundraising, money spent on their projects and money spent on deciding how the run the charity and allocate their spend. That last category is the one that might help us estimate the money an the organisation spends on management. It isn't a perfect proxy but it isn't bad. Charities, like the NHS, are not in the business of enriching their chief executives and they are under pressure from supporters and regulators to be frugal so as much of their spend should go on their purpose not on overheads. But frugality has to be balanced by the need to spend money well. Spending too little on good decisions is just as bad as spending too much.

Most charities spend more than then the NHS; medical charities often spend 3 time the proportion of revenue as the NHS. If the NHS were a charity they would risk investigation by regulators for a lack of management capacity.

There is one important caveat to this analysis. When I was looking for benchmarks I was focussing on the very heavy cuts to managers in commissioning (this was the focus of the Lansley cuts). CCGs and national bodies are the groups responsible for deciding how to configure services across the country or in a particular area. They are the people who have to decide whether it might be better to spend more in the community and less in hospitals (which traditionally dominate everything in the NHS). If they don't have the capacity to make good decisions, then the NHS is in trouble as it will be stuck with the way things currently are whether that is good for the population or not. The charity vs commissioning analysis is particularly stark for commissioners who now have so little management capacity it is hard to see how they get anything done by themselves (this, perhaps, explains their extensive use of management consultants which is often complained about by people who don't seem to understand the lack of management capacity that drives it).

But the lack of management in hospitals is also a problem. Manager jobs there should be to design effective systems, to coordinate the work of front-line staff and to do the analysis that drives and sustains improvement. This should lower the burden of paperwork and admin on doctors and nurses. If there are too few managers doing the right things then bad and inefficient processes will persist, lowering the quality and productivity of all the work done by front line staff. Improvement won't happen. And the doctors and nurses will spend too much of their time on administration instead of treating patients. IT is very obvious from the overall staffing numbers that hospitals, not just commissioners, have far too few managers.

This should be more obvious than it seems. The big NHS problems are problems of coordination and operational effectiveness. The NHS has a big issue with knowing where to spend money to make the whole system better and struggles to consistently improve or to spread best practices quickly. These are managerial in any organisation and management failure or lack of capacity makes extra spending, even when it arrives, a lot less effective than it should be. If you just spend more without knowing where the big problems are, you may well not improve the problems at all. This is abundantly illustrated by the persistent failure to analyse the real reasons for the decline in A&E performance (see my analysis).

Weak management also leaves the service incapable of resisting stupid ideas coming from the political centre. For example, Jeremy Hunt's proposal to put GPs at the front door of all A&Es is an idea that any competent analyst or manager would resist because it couldn't possibly work. Lack of management capacity leaves front-line nurses and doctors working with badly designed processes and many end up spending far too much time on administration when they should be treating patients.

There is plenty of evidence that the NHS needs more money. But even if the extra money arrives it will yield far fewer improvements than it should if the people spending it are short of management capacity. It is time to kill they myth that the NHS is overmanaged. In fact a lack of management capacity is one of its biggest problems.

Thursday, 13 July 2017

Political management of operational performance is a catastrophe for NHS improvement

You can't take politics out of the NHS: politicians have to set the budget and raise to taxes to pay for it. But that doesn't mean they should be allowed to interfere in the details of operational management. They don't understand how the NHS works and so, when they do interfere they tend to favour appearance over substance and this throttles efforts to achieve real improvement. But Daily Mail headlines are a lot less important than the wellbeing and lives of NHS patients. This is a huge problem for both sides of the debate on the NHS.

Current political thinking about NHS problems is a catastrophe

A recent story in the HSJ was a perfect illustration of why political thinking is a curse on the NHS. The story was apparently about procurement rules, though this is misleading. The issue was that a hospital wanted to take over a bunch of MIUs and WICs. The reason was that this would enable the hospital to include their (good) performance numbers in the organisations headline performance, diluting the impact of their (awful) A&E performance.

This is roughly the equivalent of Marks and Spencer buying a profitable oil exploration company to disguise the weak performance of its retail business.

Amazingly the focus of the story was on the implication for procurement rules which should prohibit this sort of transaction for competition reasons. Lets face it, only a few diehards care much about those rules (the rest of us probably should care more but it just doesn't seem like the biggest issue for the NHS right now).

The real story here is far more significance for the NHS and how it is run. It is about how political thinking has deeply permeated NHS management and corrupted it. I want to explore the implications of that.

Let's be clear about what was proposed by the trust. They wanted to waste a ton of scarce management effort on a plan to adjust the organisation structure in their local area that would result in absolutely no benefits at all for patients. Their motivation was to be able to report better results to the NHS. Note the key fact again: the rules would let them report better performance despite the fact that absolutely nothing would change for any patient anywhere in their area. As far as I can tell nobody even thought it necessary to disguise this goal. It is as if a doctor choose to treat a melanoma by offering the patient strong cosmetics to cover up the skin blemishes. All the management effort was going to game the system not to improve it.

Some might argue that management were forced into this position by the way the rules work. After all, better reported performance might result in more money for the hospital and that has to be good, right? Bollocks. If the system rewards gaming rather than real improvement the allocation of money might as well be by lottery. On second thoughts a lottery would at least be random and therefore in some sense "fair"; allocation of money to the best bullshitters is actively harmful and destructive.

It isn't a new issue for A&E performance. It was a problem in the early 2000s when the target was first set. Then as now performance was measured across a number of heterogeneous types of unit: major A&Es (type 1 departments) are open 24hrs and handle almost anything; but there are both specialist units and a mix of Walk In Centres (WICs) and Minor Injury Units (MIUs) that only handle minor injuries and don't open 24hrs. The problem is that almost all of the bad performance is in the major A&Es. I can't remember the last time a non-major unit breached the A&E target. So, if you want to know where the problems are you need to look at the type 1 performance. Diluting this focus by allowing hospitals to quote the overall results by including other units makes it harder to see where the real problems are. The top civil servants in the A&E team originally wanted to report only the combined numbers (unsurprisingly they are always better and enable the minister to claim better performance). I argued that this was a mistake and would result in management effort being misdirected. The improvement team needed to focus as clearly as possible on the places that actually had problems. The compromise was to report both numbers internally (but let the minister claim the higher numbers in public).

The story is the epitome of what goes wrong in NHS management when political thinking pushes out good operational practice.

The root of the problem is the way politicians think. They focus on what looks good in newspaper headlines today rather than what will work tomorrow. And few, if any, take the time to explore the root cause of the visible problems in the NHS.

Perhaps we should not be too hard on them as they have to win arguments. And winning argument requires persuasive rhetoric. But the facts must come before the rhetoric. Win the right argument. In reality rhetoric is often all there is and the facts are warped to fit. This doesn't work. Reality is not influenced by propaganda. Pretending things are better than they are disables the key flows of information that tells you how to improve.

It is a problem because the political way of thinking becomes deeply embedded into the way management decisions are made inside the NHS. And when you understand the political mindset you can see that this is deeply corrupting. Good management monitors and measures the things that matter for understanding and improving performance. Political management is only interested in things that can make performance look good.

The political mindset doesn't just corrupt what gets measured; it also drives plans and actions that bear little relationship to reality. So the centre demands that trusts produce "improvement trajectories" that bear about as much relationship to reality as the athletic sex in a porn movie bears to real bedroom behaviour. There is little or no incentive for an honest appraisal of the root causes of problems which might create a realistic chance for improvement but might take far longer than the planning and reporting horizon.

The political mindset is reinforced when politicians directly interfere in operational decisions. The interventions usually seem to be as unrelated to operational reality as the reported numbers. Again, the focus seems to be how to get catchy headlines that proclaim "something is being done". Jeremy Hunt's desire to put lots of GPs at the door of A&E departments sounds good if you know nothing about how A&Es work or where their problems are; if you know what A&Es are really like it looks like the policy equivalent of you taking an antibiotic to cure a viral infection in your dog.

Running the system doesn't mean interfering in the operational details

It is a cliche that we should take all politics out of NHS management. But we can't. The budget and the taxes to pay for it have to be determined by politicians: that's how UK democracy works. To think otherwise is a technocratic fantasy (often perpetuated by wannabe technocrats whose understanding of the sources of NHS problems is no better developed than that of the politicians).

The problem I'm pointing out isn't that: it is the problem of interference in operational issues. Just because politicians set the overall budget and direction doesn't mean they should meddle with how hip replacements are done, how GPs organise themselves to respond to patient needs or how flow in A&E departments is managed.

The problem of political interference isn't helped by the way opposition politicians treat problems in the NHS. They too lack any useful operational insights into the real causes of problems. But they reinforce the idea that the government should be held responsible for the details of how the NHS is run. This would not work even if the leadership at the top had any useful insights into the operational reality of how the NHS actually worked: the organisation is just too big for any central body to know the details of why Mrs Stevens waited 12hr on a trolley in Cornwall's A&E department.

The people who could know those details are the managers and medics in the local hospital or GP practice. They have the capability to understand what is happening in their local operations and to identify what isn't working properly; they should have the accountability for fixing things that don't work.

But the very people who should be able to understand problems and fix them are undermined by the process of political management. The message from the top is that it is more important to make things look good than to actually fix them; the metrics required to measure whether things are working or not are corrupted by the central need to report good news; long term improvement is useless in meeting the short term need for good headlines. Worse, the constant focus on cliches like "more resources to the front line" demonises managers and leads to policy where we might well get more medics (more doctors looks good in headlines) but those medics can't be productive because of a lack of support staff. The lobbyists for doctors and nurses collude in this by constantly arguing that the problem is a lack of whatever staff group they represent ignoring the strong evidence that their members can't be productive in a badly designed system where they are poorly coordinated and don't have enough support staff (a recent report by the Royal College of Surgeons is a notable exception).

It is also worth noting that hospitals have very few managers (on sensible definitions only 2-3% of staff are managers). Most hospitals are have fewer than they really need. Moreover, if we look at the differences between hospitals, the ones with more managers get better results for patients and have better financial outcomes.

Managers and clinical leaders roles in driving local improvement are also strongly undermined by constant burdens from the centre. The NHS isn't content to look at transparent and well designed measures of outcomes and judge local management on whether they achieve them: instead they interfere on the most detailed levels as though the centre understands how the local job should be done (they don't). This results in a demand for local units to report a mountain of badly designed metrics and to follow detailed central guidance on how they work when local experimentation and improvement would be far more effective. Nigel Edwards once described this as:

"A significant organisational pathology"

Also observing:

"Time that should be spent dealing with problems is diverted to reporting on the actions being taken and providing reassurance that previous action plans have been executed."

When bad managers are combined with political management we get the worst of both worlds. Instead of focussing on the root causes of performance problems, the managers now focus on how to game the targets. Hence catastrophic waiting list behaviours and a dangerous spike in apparent activity in A&E departments in the minutes before patients have waited 4hr. This sort of behaviour is sometimes blamed on "targets". But not all targets are bad and the real blame lies with bad managers working in a system that rewards gaming rather than real improvement.

How could we do things differently?

Political management results in poor choices of what is measured, poor choices about what is done and a persistent inability to achieve substantive improvement. It isn't an easy problem to fix but there are a lot of things than can be done even if politicians don't change their spots.

Opposition politicians should stop behaving like the government and do some homework before proposing alternatives to government policy. Understanding what is broken in the NHS is a harder job that proposing the opposite of what the government are doing, but the reward might be to improve the quality of debate.

We could do with more transparency in how we measure performance. Both the design of performance metrics and their dissemination needs to be more independent of government. Manipulation of the metrics to lower the number of bad news stories should not be allowed (see this on how reporting of A&E performance was corrupted). Other government statistics are moving this way, so should everything in health.

If we want to generate some real improvement in the NHS the key is for local organisations to do their own thing. The central metrics don't tell you what you need to do to improve. Ignore them and measure and report what helps you gain insight into performance and quality. Focus on real improvement and the government mandated headlines will follow.

One final plea to fellow commentators: do your homework. If you comment on how messed up the NHS is but do using the same superficial cliches as the government, you are part of the problem not the solution. Current policy is often based on a search for good headlines uninformed by actual analysis of the real problems. Unless those who oppose as well as those who govern learn to break from the habit of seeking headlines rather than solutions, policy won't improve.

This is very important: good Daily Mail headlines are no substitute for the wellbeing and lives of NHS patients. Unless everyone involved in the debate (journalists, commentators, lobbyists and politicians of both sides) up their game then patients will suffer and the NHS won't improve.

Wednesday, 17 May 2017

Equal area HexBins in Tableau without mapping distortion

In a previous post I described a way of creating hexagonal areas on maps automatically making it easier to show area maps using Tableau's HexBin functions. The original method creates automatic hexagons in LatLon coordinates but they become increasing elongated when plotted on the standard Mercator projections used in Tableau maps. It turns out there is a way to adjust for this that creates better-looking regular hexagons on maps (and also also make their actual geographic size more consistent). This post describes both ways of creating hexagonal regions from HexBins in Tableau.
Note, I have also posted the essence of this in
The hexbin function in Tableau works on any X-Y data to bin the raw coordinates into hexagonal areas.
On maps, the data usually comes as Lat-Lon coordinates expressed in degrees. Tableau can plot this data directly but usually does this using the somewhat distorting Web Mercator projection (areas far away from the equator are greatly enlarged). This means that hexagonal bins defined on lat-lon will not be equally sized (one degree in an east-west direction represents a much smaller distance on the Earth's surface when the latitude is high and far from the equator but a degree on a north-south line is always about 111km long).
Simple versions ignore this distortion and generate the hexbins from lat-lon coordinates. I'll describe methods based on this simple technique first. I'm basing this on a dataset containing the locations of every postcode in the UK (~2.5m rows with a wide range of density in different geographies).
The simple method (produces distorted hexagons)
The basic method involves several steps:
  • Generate a scaling factor using a parameter that allows adjustment of the size of the hexagons
  • Generate a value to define the extra points for the vertices of the hexagon (I achieve this by adding a value to each row on the dataset that consists of a random number from 1 to 6. This works when you have enough data points to guarantee a complete hexagon can be creates for each are you want to plot.)
  • Generate the lat-lon coordinates for the hexbin centres
  • Add the 6 points that define the vertices for the hexagons
  • Plot the hexagons as polygons on a map (they can then be coloured by any aggregate based on the underlying data)
Here is a step-by-step guide.
I define a parameter hexbin size that represents the radius of the hexbin in degrees. Then the hex bins are defined in the following way based on the Latitude and Longitude values in each row of the dataset:
hexbinx formula
The hexbiny formula is similar.
The dataset itself contains a field called PointID containing values from 1 to 6. (there are multiple ways to achieve this but doing it in the dataset is easier than doing it in tableau as many databases contain hashing functions that guarantee a random distribution of integers which can be turned into numbers from 1 to 6 by a simple mod function). I define a new field called angle based on the value in PointID (to help define the hexagon's vertices):
angle formula
Now I can add the points that need to be plotted to define the hexagonal polygons:
plot lat formula
The longitude is similar:
plot lon formula
At this point the dataset should contain enough rows within each hexbin to guarantee that each hexbin contains rows with every value from 1-6 so all the points of the hexagon are defined.
To plot the hexagons, make the Geographic Role for the plot lon and plot lat fields is set appropriately and double-click each in turn. Then drag the hexbinx and hexbiny to detail and change the chart type from automatic to Polygon. This will give some horrible mess which is fixed by dragging PointID to Path. This should give this (i've also added a count of the number of rows to colour the hexagons and adjusted colours and transparency):
simple hexbin map
This shows that the basic technique works. But it also shows a key problem with it: the hexagonal polygons are distorted. If you can live with this (it is less of a problem close to the equator or over small areas) thens stick with it.
But if you care about the distortions (the visual as well as the fact that the hexagons don't cover equal areas) then you have to do something more complex. What I describe below is based on work by Sarah Battersby of Tableau.
The complex method for producing hexagons of even appearance
The adjustments below involve some extra steps over and above the simple solution:
  • Generate a new scaling factor based on distance (as opposed to LatLon degrees)
  • Convert the lat-lon values into X-Y coordinates in the Web Mercator system based on distance not lat-lon degrees
  • Generate the hexbins using distance not angle in the new coordinate system
  • Convert the distance-based hexbins back to the lat-lon coordinates to define the hexbin centres
  • Add the hexagon vertices using the distance coordinates and then convert back to lat-lon 
You also need to add a parameter for the radius of the earth which is a conversion factor in the coordinate transformations. 
Here is how that works. First the conversion to X-Y coordinates:
mercator x formula mercator y formula
Now we can use hexbin grouping on the new coordinates:
hexbin mercator formula
The formulae for converting these coordinates back to lat-lon are below (useful if you want to just plot the centres):
adjusted hexbin lat adjusted hexbin lon
Note that it is the latitude formula that is the complex one.
But to make the hexagonal polygons work properly when plotted, you have to add the extra points for the 6 vertices before transforming them back to lat-lon coordinates. The conversion is the same as the formulae above and the final formulae look like this:
plot lat adjusted plot lon adjusted
The extra part just adds a fixed distance to the coordinate for the hexbin centre based on a hexagon sized by the scaling factor (and there are six of these based on the six values of Angle).
When plotted in the same way as the previous polygon plot it should look like this (when tidied up):
adjusted polygon map
Now both the visual appearance and the actual size of the hexagons are better. 
For a Tableau Public workbook containing the above calculations see here.
PS Tableau could and should build most of this functionality into the product as it would dramatically simplify a useful technique.

PPS Although the visual appearance of the hexagons is now much better, the actual land area still depends on latitude meaning that the true area of each hexagon varies. According to Sarah Battersby's calculations areas the N/S extremes of the UK will be about 10% different from the areas in the centre. If you want the details of that calculation, check her Tableau page referenced below.

Many of the ideas are based of the work of Sarah Battersby of Tableau Research. Some of her relevant talks are also available on the Tableau conference site here.
And the original ideas were inspired by (but use slightly different techniques from) some posts by Alan Eldridge (the Last Data Bender) some of which are described on his blog here.

Tuesday, 25 April 2017

Wrangling English prescribing data

This is a technical post intended to remind me of what has to be done to make sense of English prescribing data, an amazingly useful open dataset that summarises everything dispensed in the community in England (which accounts for about £8bn or about 8% of NHS spending every year). For all its usefulness, it is a pain to do analysis with it for a variety of reasons. I hope that by describing how to make sense of it my notes will also be helpful to others. I may update this as I work further with the data.


The trouble with english prescribing data is that, although it has been openly available since 2010 (every month we can see exactly how many prescriptions of each type are dispensed in the community) it isn't easy to make sense of it.

The raw data released by NHS Digital consists mainly of a file detailing the BNF code (a 15 character code describing the exact item being prescribed, BNF is the British National Formulary), the GP or clinic code for the issuer of the prescription, and the amount of things prescribed (by book price, actual cost after discounts and charges, number of pills/volume/units). What it doesn't have is the sort of metadata that helps you group these things together into useful categories (GPs mapping to CCGs or the grouping of drugs into categories).

The BNF code does contain useful information to solve some of that problem and there are lookups for the mapping of GPs onto bigger units. But these have to be downloaded from other sites and the extra data doesn't always map neatly onto the raw data. This blog describes how I sort, wrangle and combine the sources to make something more useable partially to help me remember and also so others can see roughly what is required if they want to play with the data.

BNF codes

BNF codes actually contain a hierarchical coding that groups drugs and devices into categories. The codes group drugs into chapters, sections, paragraphs and subparagraphs. Chapters are mostly about what sort of problem the drug deals with. For example, there are chapters for drugs treating the digestive system, drugs treating infections and devices for incontinence. The other levels group these together in a fairly reasonable hierarchical classification. It works like this:
  • Characters 1-2 code the chapter
  • Characters 1-4 code the section
  • Characters 1-6 code the paragraph
  • Characters 1-7 code the subparagraph
  • Characters 1-9 code the chemical or device
  • Characters 1-11 code  the product (often essentially the same as the chemical)
  • The remaining characters code the specific item (different versions of the item, different formulations of the same drug, different strengths of pill, etc.)

There are some complications on top of this, though. Dressings and appliances only have 11 characters in their codes (though the raw data makes all codes 15 characters long by right-filling the codes with blank spaces, some metadata sources use a mix of 11 and 15-character codes just to be really annoying).

So, while we can derive a hierarchy from the raw data, we need to find a source for the names of chapters, sections, paragraphs and so on (the raw data comes with a file describing the names of each specific item's formulation). We could just buy the latest BNF (available in book form with all sorts of useful advice about drug use, safety and dosage and updated every six months). But the book is over 1,000 pages long and there are nearly 40,000 items to code so it isn't a very practical way to get the metadata unless you are both a masochist and a good typist.

Annoyingly the BNF doesn't, as far as I can tell, release an electronic form of the data (even their iOS app requires an Athens login presumably in a futile attempt to prevent non-qualified people misusing the data even though those who want to can just buy the physical book). But the BSA (the NHS Business Services Authority, who compiles the data from the raw prescriptions used to pay pharmacists), does. It is accessible as described in this useful opendata.stackexchange post. But the BSA have a deserved reputation for being annoying awkward bastards when it comes to open data. The opposed making the raw data available as opendata when Tim Kelsey originally proposed the idea. Their own tools for accessing the data are clunky, slow and primitive. And they don't fall over themselves to make anyone else's life easier. For example, their metadata is coded using both 15 character and 11-character BNF codes (unlike the raw data which pad all codes to 15-characters long). Worse, the lookups are inconsistent from year to year (the 2016 release has 35,673 BNF codes but the 2017 release has 74,653 even though the entire public dataset doesn't have that many unique codes in it). Worse still, not every code in the public data has a matching code in the metadata file.

It is unclear why there are so many more codes in the latest BSA metadata. Here is how this looks compared to the number of codes actually used in the raw volume data:

And this understates the scale of the problem as some of the codes in use don't match anything in the metadata. This table shows that problem:

This table shows that 1,277 codes in the raw data are not present in the 2017 metadata at all.

This is pretty fucking annoying for those of us who want to do analysis.

All is not lost. The correct position  in the hierarchy can be retrieved for the things actually prescribed as their codes contain some of the information about their position in the hierarchy (remember the BNF code contains the chapter, paragraph etc. codes).

So we can highlight the 1,277 missing BNF codes by doing some SQL magic in BigQuery:

Screen Shot 2017-03-28 at 18.22.45.png
And then we can reapply the known chapter, section etc. values, where they exist. Unfortunately, they don't always exist because the BNF sometimes revise the hierarchy, adding new ways to classify drugs. The biggest change recently was the introduction of a new paragraph and subparagraph classification for drugs used in substance abuse. This replaced the codes of drugs with new codes but means that the old codes won't fit correctly into the new classification. This can be fixed manually, though it is a pain in the arse to do so.

In addition, the raw data files containing metadata for chemical/product names uses a different structure for the chemical codes for the chapters consisting mostly of devices and dressings (all chapters>19). Instead of a 9-digit code the product/chemical is described by a 4-character code (ie the same as the paragraph code).

But, despite these difficulties, I persisted anyway. I extracted the codes actually used in the raw data and filled in (to the best of my ability) the missing hierarchy parts so that I could produce (nearly) complete classifications of everything actually prescribed.

Prescriber Locations

Then we have the problem of knowing who prescribed and where they are geographically. The raw data contains the prescriber code (some clinics but mostly the standard code for GP Practices). It also contains fields described as SHA and PCT.

Unfortunately the NHS has been reorganised since the data was first published and both Strategic Health Authorities (SHAs) and Primary Care Trusts (PCTs) were abolished and replaced by NHSE Area Teams (NHSats or some acronym like that) and Clinical Commissioning Groups (CCGs). The data in the SHA and CCG columns switches to new codes somewhere in the middle of the timeseries. This is sensible, but has the major disadvantage that the raw codes don't allow a consistent set of regional comparisons over time even though the actual people doing the prescribing are the mostly same (neglecting the small number of closed and new GP Practices). Another problem is that the fields contain codes that are not regional or local bodies. Some look like the codes of hospital trusts; sometime SHA codes appear in the PCT field and sometimes codes are used that don't appear to match anything I recognise.

There are two ways round this problem. One is to use the geography of postcode locations to map the locations of practices onto the area where the postcode is located. The ONS produces lookup tables showing the higher-level geographies each of the ~2.5m postcodes in the UK resides in. This won't always match the actual NHS administrative hierarchy for GPs, but is is close and we can extract the relevant NHS codes from a single lookup table, the ONSPD table which is large but manageable and downloadable from the ONS Geography portal.

The alternative is to match GPs to CCGs using the lookup tables NHSE are supposed to provide. This should be exact, but the tables are not always up to date and some codes were altered after the original release (presumably just to make life harder for analysts like me). And CCG membership and boundaries have changed several times due to GP defections, and CCG mergers/splits. Since doing consistent geographic analysis is clearly a waste of time for everyone, why should I expect this to be made easy for me?

Even if we use the NHS GP to CCG lookups, we still need the ONSPD file to identify the precise location of practices. This file, at least in recent versions, give the lat/lon of every postcode so enabling location maps with little fuss if your choice of analysis software can map using lat/lon (I use Tableau which can draw a map as soon as you give it a lat/lon).

Back to the raw data which has some further traps for the unwary. Alongside each month's prescribing data,  NHS Digital release a file containing the lookup from prescriber code to prescriber name and full address. WOOT! You might think, problem solved. Unfortunately not completely as there are some consistency problems. For each practice code there are a practice name and 4 address fields in the data. The last two address fields should contain town and area, but often don't. Sometimes the last field is town; sometimes it is blank; sometimes the town fields contains a third address. Luckily we rarely need those full addresses so we don't need to fix these inconsistencies. But we do need the postcodes and they are, at least, all in the same column of the data.

But not in a consistent format for fucks sake. There are several possible standard ways of formatting postcodes (the ONSPD file has three alternatives in case your data uses a different one). This happens because postcodes have different numbers of characters (for example, B1 7PS; BT6 1PA; SG18 2AA; SW1W 9SR). You can force fit all postcodes to 7 characters (some will now have no spaces), 8 characters (some will have multiple spaces) or a variable length which always has just one space in the middle. The raw data for prescriptions is supposed to be in the variable length format. And it often is, but not always. In some months some of the postcodes have an extra trailing space which means that any lookup to location using the ONSPD file fails. It is easy to fix (databases and spreadsheets have TRIM functions to remove redundant space) but why didn't NHS Digital apply this before releasing the damn data. It isn't as if they are even consistent every month.

This isn't the only problem with the postcodes. Some of them are wrong. It is almost impossible to tell this unless you draw maps and observe locations over time (I do which is why I spotted the problem). Some practices change postcode over time in ways that don't make sense. Sure, some practices move location to new buildings and get a new postcode. But in the data some make leaps of hundreds of miles and then move back. On investigation, some codes have been transposed from practices near them in the list in some months. This isn't common, but suggests that some analyst has been making transposition errors when putting the data together and hasn't found a reliable way to validate the whole monthly dataset.

Putting it all together

Now we have a (mostly) complete way to classify drugs we can look at the whole hierarchy. A summary is given below:

hierarchy counts to dec 2016.png

Note that the number of distinct names doesn't entirely correspond to the number of distinct codes. This is because of several factors: sometimes the name is changed over time for exactly the same product; sometimes the hierarchy is reorganised giving existing drugs a new BNF code (for the same exact product) and a variety of other reasons.

Normalising the results

Sometimes we need to know more than just the total amount prescribed. When we want to compare GP Practices, for example, we need to account for the fact that GP lists have a wide range of sizes and population demographics. A fair comparison of two GPs requires adjustments for both the size and age mix of their populations. The old pop a lot more pills than the young so any comparison needs to adjust for the age mix or it won't be fair.

The simplest normalisation is pills per head which is easily derived from gP Practice List size. A better comparison needs to use the whole demographic profile for the population on the list. Two common schemes are to use units known as ASTRO PUs and STAR PUs. ASTRO PUs take into account the age/sex mix by defining a standardised rate of prescribing by number or spend for different age/sex bands in the population (which is originally based on the English average rate though the actual analysis is only done every few years). STAR PUs are defined for specific categories of drugs where the national rate may be inappropriate.

An ASTRO PU (item or cost) is the cost or number of items that a practice would be expected to incur if it prescribed in exactly the same way as the average practice in England. Per quarter (so you need to adjust the timescale if you are working with units other than quarters). I create a table of monthly ASTRO PUs from the quarterly population data so I divide the values by 3 to get an expected monthly number of units.

But deriving the relevant factors for a Practice is not simple as list sizes change over time. And, as usual, you have to find the data, which isn't trivial. The NHS BSA have some but it only starts in 2014 (our prescribing data starts in mid 2010). NHS Digital have quarterly data from 2013 and, if you search carefully enough, some data from 2011. But no obvious single series going back to before 2013.

And that isn't the only problem with the data. Recent quarters add the useful extra information containing the ONS standard codes for NHS geographies as well as the NHSE codes. But this means the files are now in a different format and must be edited before being combined into a single source. Plus the file names are inconsistent and don't (usually) contain date information so you have to edit that in manually.

To add some anal-probing to the severe beating the poor bloody analyst has already taken, they extend the age bands in 2014 and later data to include 85-89 and 90-95 (previously ended with 85+).

In fact here is a suggestion for how NHS Digital could make everyone else's life easier: copy the Land Registry. They release monthly files of house sale transactions in a consistent format that is easy to join together if you want to create a bigger dataset covering a long time interval. And they also publish a single consolidated (and very large) file that contains all the data they have ever released so you can get a consistent single file with all the data you want with just one download. Or you can do the big download to get started and then update every month from the monthly release without worrying about consistency or manual edits.


The bottom line is that making sense of England's prescribing data is possible, but, at least for parts of the relevant data, unnecessarily hard. The difficulty of deriving joined-up datasets that can be visualised is far, far harder than it should be. It is as if there were a conspiracy to stop open data actually being useful.

NHS Digital could help. They could release the data and all the associated data in a single place where users could use it immediately. They could create repositories that release the data in a single lump rather than in monthly extracts. This wouldn't be easy as CSV files (the result would be far too big compared to the Land Registry where the accumulated file containing all house prices since the mid 1990s is only ~3GB as a single CSV. But there are people would probably do it for free if asked nicely: Google already host bigger public datasets on their BigQuery data warehouse and Exasol have assembled the core parts of the prescribing dataset on a public server for demonstration purposes (see where you can register for access to this and other interesting datasets and many examples of data visualisation based on them). This is the sort of thing that NHS Digital ought to do to encourage the use of their data. Diesseminating it isn't enough.

I should also mention that Ben Goldacre and Anna Powell-Smith have produced an online viewer ( for many aspects of the data but it doesn't (yet) have true freedom to explore any aspect of the data any way you want.

And, not forgetting, my Tableau Public summary of the national data (which leaves out GP-level data to fit into a public worksheet).