Researchers say analysis of prescribing hampered by ‘poor quality’ NHS data

  • 7 August 2018
Researchers attempting to investigate a rise in prescriptions for asthma medication have suggested that ‘poor quality’ public sector data could limit the effectiveness of new technologies and health policies.

Data science firm Polymatica sought to use NHS Digital’s GP practice data to pinpoint the cause of a 17% rise in medications prescribed for asthma between 2011 and 2017.

The data shows that the amount of asthma medication prescribed last year hit 54.6m items, up from 46.5m in 2011.
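As a quick check on the headline figure, the rise can be recomputed from the two totals quoted above. This is only an illustrative sketch; the function name `percent_change` is made up for the example and is not part of any NHS Digital or Polymatica tooling.

```python
def percent_change(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100


# Totals quoted in the article, in millions of items.
items_2011 = 46.5
items_2017 = 54.6

rise = percent_change(items_2011, items_2017)
print(f"{rise:.1f}%")  # prints 17.4%
```

The result rounds to the 17% rise reported.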

But Polymatica claimed that when it tried to find potential links between the rise in prescriptions and external factors, it was unable to draw any conclusions due to the “poor quality” of NHS Digital’s information.

Mark Hinds, CEO of Polymatica, said: “We wanted to see if external factors such as socioeconomic status or pollution would affect the level of prescriptions. But the data left us questioning whether the infrastructure and processes in place for data entry and management are up to standard.

“The government is clearly willing to make changes to public health policy – but what are they basing these decisions on? You need clean data to understand the root cause of problems like rising asthma medication.”

According to Polymatica, the main issues stemmed from the fact that data was entered manually. Because of this, addresses were often entered incorrectly, contained spelling errors and used various abbreviations that made the data difficult to aggregate.
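To illustrate why inconsistent abbreviations make aggregation difficult, the sketch below normalises a few address variants before summing item counts. It is a minimal example under stated assumptions: the abbreviation map, the sample rows and the function names are all hypothetical, and it does not reflect the actual structure of the NHS Digital dataset or Polymatica's method.

```python
import re

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS = {
    "rd": "road",
    "st": "street",
    "ave": "avenue",
    "ln": "lane",
}


def normalise_address(raw: str) -> str:
    """Lower-case, strip punctuation and expand known abbreviations."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)


def total_items_by_address(rows):
    """Sum item counts per normalised address."""
    totals = {}
    for address, items in rows:
        key = normalise_address(address)
        totals[key] = totals.get(key, 0) + items
    return totals


# Made-up rows: the same practice address entered three different ways.
rows = [
    ("12 High St.", 3),
    ("12 HIGH STREET", 4),
    ("12 High St", 5),
]
totals = total_items_by_address(rows)
print(totals)
```

Without the normalisation step, the three rows would be counted as three separate addresses. A simple lookup like this does not handle spelling errors or ambiguous abbreviations (for example, "St" for "Saint"), which is part of why manually entered data remains hard to clean.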

“Ultimately, poor data quality harms results and creates inconsistent insights,” said Hinds.

“The consequences for this could be sizeable – impacting policy decisions based on data analysis and limiting the effectiveness of new technologies such as artificial intelligence.”

Not designed for analytics

The dataset used for analysis consisted of more than 700m rows of information on prescriptions written in England between 2011 and 2017.

Its primary purpose is to ensure the reimbursement of pharmacists and dispensing doctors in the NHS, Chris Roebuck, chief statistician at NHS Digital, told Digital Health News.

Roebuck acknowledged that there were some “data quality issues” in the data that could “limit some secondary analysis of it.”

He said: “It is positive that open data such as this is being used for secondary purposes by third party organisations. But it is important to understand that there are times when those undertaking such secondary analysis will encounter limitations because the dataset wasn’t designed for the purpose they are seeking to use it for.”

Roebuck also argued that the dataset used by Polymatica did not contain all the information necessary to draw the conclusions it was seeking.

“Since drugs can be prescribed to treat more than one condition, it may not be possible to separate the different conditions for which a drug may have been prescribed,” said Roebuck.

“For example, [Polymatica] may have looked at medicines that can be used to treat a range of respiratory conditions and not solely asthma.”

Despite this, Hinds suggested that better quality open data could offer “a genuine opportunity for third parties to support the NHS in helping make the nation healthier”.

He told Digital Health News: “With government funding and additional support from business, good quality open data could play an enormous role in driving initiatives such as highlighting the impact of emissions on our health and helping to put policies in place.

“With the likes of the British National Formulary, the NHS has already taken a positive step to ensure good data quality and with further investment, open data could be a powerful tool in helping to proactively identify causes of illness, reducing the burden on the NHS in the process.”

26 Comments

  • If there is any PID in the data being provided, either qualitative or quantitative, then there is a breach of confidentiality, which is a very serious issue indeed and would need to be investigated by the appropriate organisations. An independent body should examine the data to determine whether or not it contains PID.

  • Industries need to slow down as a whole and put less focus on hoarding data and more focus on WHY they need the data, HOW they are going to use it, WHERE it is coming from and finally, HOW they are going to keep it clean and protected. You can’t automate everything; humans will manually input data for the time being and we just need to step up and make it easier to manage going forward. For the cleaning part of this equation, I recommend checking out Data Ladder’s DataMatch Enterprise here: http://bit.ly/2H7LO1s

  • So if my understanding is correct, there is no structured patient data in the data set (name, address etc.) but some patient data was available in the free text field and this is what Polymatica had access to and tried to analyse? If this is correct, it raises a lot of questions about the rules and processes related to data entry of free text, privacy compliance and the role of NHS Digital in ensuring that PID was not available in the data set, or capable of being identified in conjunction with other external data.

    • One of the privacy questions it raises is whether NHSBSA are collecting more personal data than is strictly NECESSARY for the purpose of authorising payment to a pharmacist for dispensing a prescription. Could anyone tell me what data is actually necessary for that purpose, and why it is necessary?

      Also, my understanding is that NHS England delegates the function of paying pharmacists to NHSBSA (under a particular Regulation). Why is the data that may be required by NHSBSA being passed on to NHS Digital? What essential part do NHS Digital play in paying NHS pharmacists for dispensing prescriptions? If none, then they should not have the data in the first place, never mind disseminate it inappropriately. GDPR Article 5(1)(b).

    • We rarely put any meaningful information into the free text box for prescriptions, other than the instructions (e.g. take one twice a day).

      • Tx Neil but who is ‘we’? As there seem to be no restrictions is it not possible that the field is used very differently by others? If correct, then it’s not clear to me where they got the patient-related data from as others here have said there are no specific patient data fields.

        • If there are any NHS employees out there, please listen!
          This is the professional way to do IT:
          1. establish your requirement
          2. analyse the model to see if it can satisfy the requirement (*)
          3. analyse the data to see if there is sufficient data of sufficient quality

          If there is no model then U R in trouble !!!

        • sorry, by “we” I meant GPs (who generate the overwhelming majority of such prescriptions)

  • Why were Polymatica given patients’ full addresses (PID) in the first place?
    Then we wonder why many people don’t Trust the NHS with their data!

  • Tricky, they are trying to do something +ive and clearly need more data. The NHS is rich with data and the data is priceless, just like blood.

    • Who and what, exactly, would you say is responsible for the situation in which patients feel that they need to try to protect their health data, because nobody can be trusted?

      • Maybe it’s because many can’t even see “their” data for themselves, can they? Tricky sharing something when you do not even know what you are sharing …

  • Is MIQUEST still going?

    Once we get proper FHIR working in GP practices we should be able to move to public health queries.

    * proper FHIR – i.e. not built by committee.

  • Were patients informed that their prescription data collected by NHSBSA (Prescriptions Service), supposedly for the purpose of paying pharmacists for dispensing the prescriptions, was being distributed by NHS Digital for other secondary purposes “not compatible with the purpose for which the data was collected”? How much of the fully identifiable data used belonged to patients who had registered a Type 2 objection, meaning it should not have been distributed by NHS Digital to anyone for any purpose? There are echoes here of the MoU between NHS Digital and the Home Office, and the Health Committee Inquiry into this, earlier this year, which unmasked NHS Digital for what it really is.

  • We don’t put socioeconomic status, pollution, or indeed the reason why we are prescribing a particular drug, on an FP10 prescription. So you’re never going to get that information from prescription data.

    • You could get a much better insight doing this kind of data analytics locally and there are tools out there to do this. As mentioned by NHSD above, they tried to do this research on data designed to process prescription payments.

      • Absolutely. Polymatica should approach GP surgeries directly and we could provide the necessary aggregate data, or with s251 approval, pseudonymised/record level data.

        We still don’t record socioeconomic status or pollution directly (although we have postcodes, obviously).

    • Primary care prescribing is currently a rather unstructured ‘hit and miss’ process. The medication dose, route and frequency are not computed and coded but remain as a free text blob of information. All medicines prescribed to patients should be linked to a BNF licensed indication (a flag should indicate an unlicensed use).
      Secondary care is (finally) moving forward (e-Prescribing) very quickly with structured and coded medication messages/records and there needs to be a common development roadmap to standardize medication record structure between primary and secondary care.
      If this isn’t done, seamless transfer/interoperability of medication information will remain very difficult if not impossible.

      • If you need a lot of records to get the required statistical power to ask a question, then approaching GP surgeries directly is impractical. You really need an existing aggregate dataset.

        For this particular study, because presumably with a popular asthma treatment you’d get plenty of prescriptions of that drug from each GP surgery, it might be practical to do as you suggest. But if it’s a little used drug or a rare condition, you might need to deal with hundreds of GP surgeries, adding months of work to what could be a short project.

        The usual way of doing this kind of study is to take a big aggregate GP dataset like the ones MHRA, NHSD, SystmOne/TPP sell, and then use NHSD to link it to Office for National Statistics numbers for income and deprivation in the area where the patient lives. NHSD do the linking bit, so the drug company doesn’t get to see the PID. However, I’m sure you can see some flaws just from reading that – what post code you live in only says so much about your lifestyle personally.

        • Are you seriously saying that TPP sell patient data? I was told that they have access to GP records under a contract signed by the Secretary of State, which requires them to sign a confidentiality agreement. We all know that confidentiality does not exist in the NHS, but you seem to be suggesting that TPP is a source of patient level data that is regarded as legitimate. Could you suggest any source of corroboration of, or further information on this, to me, extremely shocking revelation?

        • Thank you for the information about researchone.org, Bob (there seems to be no link to reply in the right place). I am astounded and horrified to hear of this. De-identified never means what it says and TPP is the last organisation that should be allowed to extract patient data from GP records. So much for their confidentiality agreement. The duplicity involved here is breathtaking. The NHS allow patients to opt out of extraction of their data from GP records by NHS Digital under Care.data, pretend to drop the whole scheme and silently allow TPP to extract, store and distribute the data instead. I am sure patients have never been told about this use of their records. It certainly isn’t in my GP’s Privacy Notice and all GPs in the area appear to have the same privacy notice, produced by the CCG. The deception and duplicity seem to be never ending and ever multiplying, always enabled by the blatant lie about data being de-identified when it isn’t.

          • Bertl, to help me understand your position better and inform the debate, would you please…

            a) read the information about researchone. I would recommend this page in particular (http://www.researchone.org/about/)

            b) explain why the research ethics committee should not have given the favourable opinion (The REC reference number is 11/NE/0184)

            c) explain why the National Information Governance Board (NIGB) review that deemed the data non-identifiable is wrong

            d) let us know what happens when you contact your CCG/GP to see what they say about meeting their obligations around privacy notices and whether they are following the opt-out process appropriately.

            e) set out how you would provide data for service improvement and research in a safer way.

            I am keen to understand how the NHS can do this better in the future to help improve the healthcare we provide, both clinically and operationally.

          • I have a simple question I would like someone to answer: is data in free text fields passed to TPP (or anyone else)? If it is, is it cleaned of potential PID before it is – names, postcodes, address etc.?

    • Presumably that’s why they wanted the full addresses. By cross-matching with Expedia (or similar) they could deduce to a high degree of accuracy the socio-economic status and proximity to sources of air pollution such as congested roads.

  • All prescribing and associated activity should be assigned to a pathway. U can not dump all activity in a single bucket; that makes no sense.
