Author Archives: Stephen Brett

About Stephen Brett

I qualified as a solicitor in 1997 and joined Anderson Law LLP in 2011, becoming a partner in 2019. My focus is on technology transfer and data protection law. Most of my work is with research funders and NHS Trusts, but I also regularly advise technology-rich SMEs.

Data is the new oil: is that true for research data?

The Economist: May 2017

Clive Humby (he of Tesco Clubcard fame) is credited with coining the 'data as oil' analogy in 2006.  The Economist then gave it a boost.  There are definite parallels between the costs of turning crude oil into a garage forecourt product and the effort necessary to turn raw data into the fuel that powers the data revolution.  The analogy has survived because it is good.  So good that it is now something of a cliché, as all-pervading as the light bulb is as shorthand for all things innovative.

My curiosity is whether the analogy stands for data that is collected as part of an academic research project, particularly routinely collected health data.

Expensive stuff, data

The analogy catches one of the expensive problems with data.  Wiser heads than mine have mentioned that you can’t own data.  The analogy implicitly acknowledges that data, like the sticky black stuff that comes out of the ground, needs work before it becomes the user-friendly tool on which theories can be based and algorithms tested.  That work takes time, resource and ultimately, money.  Hydrocarbons need to be extracted from unforgiving parts of the planet.  Data needs to be harvested from people or pulled out of record systems or databases that can be labyrinthine (try getting two NHS Trust IT systems to talk to one another) or just plain muddy (put too much data into a datalake without paying attention and the datalake starts to look like plasticine at the end of a toddler group session: a brownish-purple blob with few distinguishable features).  Where crude oil needs refining, data needs contextualising.  In its crude state, data is noisy and messy: illegible, out of date, incomplete or in the wrong format.  Routinely collected health data often isn’t collected or organised in a way that makes the switch to a research use easy.

How far can you stretch an analogy?

Stretching the 'data as oil' analogy: just as extracting and refining oil carries risk, so the use of data can be a minefield.  In the oil industry, stuff can blow up.  Data does not (always) pose the same immediate threat to life and environment, but it does carry risks: where data is used in a clinical setting there is a very real risk to life.  If you read the data wrong, or if you fail to correct inherent bias in the data or in the process of collecting the data, then your predictions and algorithms will be a massive fail.  Legally speaking, data use is dependent on permissions.  Exceeding those permissions risks infringement actions and damages.  Data protection imposes restrictions and obligations that are not quick, simple or cheap to comply with.  Failure to comply risks regulatory action, damages claims and/or a public backlash.  There have been two attempts to use NHS data on a national scale: the first failed miserably in the face of transparency and confidentiality issues (remember; the second (GP Data for Planning and Research) has been postponed to an unspecified start date, having been heavily criticised for… a lack of transparency.  [Lack of] trust is a massive issue.

Breaking the analogy, there are two aspects of research that do not map onto the process of refining crude oil.

Refining crude is a one-way process.  Oil cannot be unrefined and reused.  Refined data can be interrogated, broken down and then reused.  That is how analysts seek to understand anomalies: go back to the source.  It doesn’t always work – refining data often involves removing parts of the data, and that loss of data may render it impossible to reverse the process.

The costs of refining crude into petrol are predictable: forecourt pricing can be broken down:

[Chart: breakdown of fuel pump prices (% as at 8 Nov 2021, based on RAC data)]

The costs of refining data are harder to predict: not all processes are equal or equally effective.

This highlights a big problem: how do you value data?  Spoiler alert, I have no answer to that.  I can offer two observations.

Be realistic. 

When it comes to valuing data, there are extreme views and nuances in between. 

One extreme view: raw data is in such a mess that the user has to spend so much time and so much money sorting it out that there is no sense in the user paying for it, even though once the user has sorted it out it will be commercialised with enthusiasm.  That view is bonkers.  If you rely on another source to supply the data then you need to accept that that source will need to see a return. 

Another extreme view: data has so much potential value in research terms and in financial terms that no sensible provider should release it unless the user is prepared to pay handsomely and upfront.  That view is less bonkers, but only if the price charged is based on the cost of extracting the data.  Want to price your data according to the value that will be realised from it?  You need a crystal ball.  Or an algorithm (ironically).  Valuation is a commonplace conundrum in the research world: my material (cell line, antibody, tissue sample, data) forms a tiny part of your process.  But your process cannot progress without my material.  Hence observation one: be realistic.  It takes money and resource both to refine data to a usable product and to collect the data in the first place.  In many cases, access to data becomes as valuable as the data itself.  There is a reason that AI companies fight to forge links with the NHS.  Providers and users each need to recognise that the other has incurred costs already (collecting data or building a business) and that each is contributing a valuable asset (raw data or the expertise and resource to refine and use it).  Each needs to be realistic in their expectations of return (‘the more resource I have to commit, the less I will pay for access’) but also realistic enough to recognise that each needs what the other has.

Do you need to value the data itself?

Logically, if you can’t own data, you can’t sell data so you don’t need a notional price per data unit.  Focus on selling the things around the data: sell the right to access and use the data by way of granting licences to database right or copyright or by using confidentiality as a controlling mechanism.  You can assess and charge for the resource you have expended in collecting and cleaning (mining and refining) the data.  The real money isn’t in the data.  It is in the data services.

So, does the analogy work for research data?

Does the ‘data is oil’ analogy work for research data, including routinely collected health data used in research?  Only up to a point.  Tesco’s Clubcard is nothing if not an exercise in research: collecting data and spotting trends.  Crude oil and crude data are both difficult to mine and refine.  But there are key differences: the cost of extracting and refining oil is easier to quantify and the price of the end product is easier to predict.  The costs of extracting and refining data are much harder to quantify and the end product is much harder to value.  The money is in the services, not the product, especially in a health care setting where selling data is still at least a mild taboo.

“Once the data is collected, it will only be used for the purposes of improving health and care. Patient data is not for sale and will never be for sale.”

[Extract from a letter from the DHSC (Jo Churchill MP) to GPs about GPDPR – 19 July 2021]


Filed under Uncategorized

A masterclass in negotiation

This article comes from my colleague Stephen. Someone else writing an article on this blog is a sufficiently rare event that I thought I should mention it (IPDraughts)

I recently completed a six-week masterclass in negotiation tactics – aka the school summer holidays.  One unexpected takeaway for me has been realising the similarities between two negotiation bibles.

Getting to Yes comes from the Harvard Negotiation Project and cites weighty strategic examples such as the SALT disarmament talks and the Egypt–Israel peace negotiations.  It sits on all good corporate bookshelves.  How to Talk So Little Kids Will Listen is practical help for parents.  Styled as a survival guide, it is written by two Mums from America – albeit Mums who are acknowledged experts and ‘parent educators’.

Reflecting, what has struck me is the crossover.  In many ways these two books say essentially the same thing.

GtY: academics on negotiation

GtY – principled negotiation

GtY espouses principled negotiation over positional bargaining.  Negotiators should avoid the hard or soft game.  In that game, some will focus on looking for the answer they will accept (soft) or for the answer you will accept (hard); others will always be flexible and yield to pressure (soft) while others dig in and resist pressure (hard).  Instead, GtY puts the focus on the people and their interests.  Understand those and you will see a way through.  GtY’s thinking is that conflict is often rooted in the subjective views that exist inside people’s heads.  So, give the participants a stake in the outcome in order to minimise conflict.  Be hard on the problem but not on the people.  Avoid threats because they just spiral and things soon get out of control.  Spend time working together to brainstorm inventive solutions.  If things become heated or stuck, change the environment – move to a neutral space, refresh the context.  Consider every suggestion but settle on an approach that uses objective criteria because that will lead to a fair solution.

HtT – resolving conflict

HtT: parents on negotiation

HtT focusses on resolving conflict.  There is much conflict involved in parenting – even getting past breakfast can feel like enough discussion, negotiation and (often) screaming to last a whole week.  HtT says parents should avoid cycles of punishment and reward because they escalate and rapidly become unrealistic.  Instead, problem solve together.  HtT suggests working together to brainstorm a list of possibilities.  Everyone involved has a chance to explain what really worries them and what they are really after and why.  Everyone has the opportunity to add suggestions to a list (actually writing a list is important) however bonkers those suggestions might seem.  As a result, everyone involved has had input to the proposed solution and everyone feels engaged.  And, if (when) discussions become too intense, HtT urges parents to de-escalate by managing the environment not the child. 

How to talk to get to yes…

As I see it, there are some common themes emerging:


Talk and listen, don’t just shout.  Realistically, how often has shouting at your child (or your business partner) really worked?  Even if it got you where you wanted to be, chances are the relationship suffered.  GtY says the negotiator should avoid positional bargaining and avoid the traditional hard or soft tactics.  HtT says the parent should focus on understanding what people are saying (and why) and on resolving conflict.  Children don’t go ‘oh, well now you’re shouting, of course I will comply’.  But, if given the chance, they do sometimes say ‘I’m not doing it because I am really worried about this’, allowing you to make ‘this’ go away and for peace to return.

Look for solutions together.  GtY suggests brainstorming inventive solutions.  Negotiation should be about merits, not positions.  Negotiators should give the participants a stake in the outcome by involving them in designing the solution.  HtT proposes that parents should exploit the positive possibilities of problem solving together with their children: make lists of suggestions together.  Any suggestion can be added onto the list, however ludicrous it may seem.  That way, everyone is heard and everyone is engaged.

Don’t make threats.  GtY and HtT both emphasise how easily and quickly threats and rewards mushroom and take on a life of their own.  With children, the threat of restricted screentime can quickly accumulate to the extent that, if followed through, there would be no screentime for about the next four years.

It is a distraction but is it cake?

The environment matters.  GtY says that changing the environment can promote calm and progress.  HtT urges parents to manage the environment not the child.  If the cake is a distraction, hide the cake.

Of course, there are limits to this comparison.  I doubt many children start by considering their BATNA (Best Alternative To a Negotiated Agreement) before they grab the TV remote (‘I want Paw Patrol, I’ll settle for David Attenborough but anything less and I’m going to play with the Lego instead’).  GtY sees the BATNA as the key to ensuring that you don’t accept a bad deal.  But this is a blog post, not a doctoral thesis.  Quite possibly this post simply highlights an encouraging curiosity: in getting the kids out of the door to school on time (with their shoes on the correct feet) or in getting the kids to turn off (or even pause) YouTube for long enough to eat supper, parents are cementing their position as top-flight negotiators.  Harvard Negotiation Project, here we come!


Filed under Uncategorized

Data consents: let’s get granular

This blogger has previously discussed some of the difficulties in relying on consent as a justification for lawful processing under GDPR, but these difficulties bear closer examination.  First, the basics.  Then some thoughts on the use of consent in the research world and whether it is always needed.

The basics

Consent is one of the six lawful bases that justify the processing of personal data.  To be adequate, consent must be a freely given, specific, informed and unambiguous indication of the individual’s wishes by a statement or clear affirmative action – granular is the word the regulators use.  It is not silence or a pre-ticked opt-in box.  It is not a blanket acceptance of a set of terms and conditions that include privacy provisions.  It can be ‘by electronic means’ – it could be a motion such as a swipe across a screen.  But, where special category data (sensitive data such as health data) are processed and explicit consent is needed, this will be by way of a written statement.

The data controller must be able to demonstrate consent.   This goes to accountability – the controller is responsible for demonstrating compliance across the piece although GDPR does not mandate any particular method.

Consent must be requested in an intelligible and easily accessible form and must be clearly distinguishable from other matters.  The request cannot be bundled up and appear simply as one part of a wider set of terms.  When the processing has multiple purposes, consent should be given for each of them – granularity again.  Conflated purposes remove freedom of choice.

Consent must be freely given.  It must be a real choice.  Employers will always find it hard to show that their employees have consented freely, for example.  The choice needs to be informed.  Without information, any choice is illusory (the transparency principle).  As a minimum, the informed individual would need to know: the controller’s identity; the purpose of the processing; the data to be collected and used; and, that consent can be withdrawn.

It must be as easy to withdraw consent as it was to give it.  This doesn’t necessarily mean that withdrawal must be by the same action (swipe to consent and withdraw) but rather that withdrawal must be by the same interface (consent via the website, withdraw via the website).  After all, switching to another interface would involve ‘undue effort’ for the individual.  If consent is withdrawn, the individual must not suffer any detriment.

With pleasing circularity, demonstrating that withdrawal carries no cost and no detriment (meaning no significant negative consequences) helps to demonstrate that the consent itself has been freely given.
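Pulling those basics together: GDPR does not mandate any particular method for demonstrating consent, but the requirements above (granular per-purpose consent, an affirmative action, records, easy withdrawal with no detriment) do suggest a shape.  A minimal sketch in Python – every class, field and name here is this blogger's own invention, purely illustrative, not any prescribed or standard format:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class PurposeConsent:
    """One granular consent: a single purpose, affirmatively given, withdrawable."""
    purpose: str                      # e.g. "contact about trial follow-up"
    given_at: datetime                # when the affirmative action happened
    mechanism: str                    # how it was given, e.g. "website tick-box"
    withdrawn_at: Optional[datetime] = None

    def is_active(self) -> bool:
        return self.withdrawn_at is None

@dataclass
class ConsentRecord:
    """Per-individual record: one entry per purpose, never a bundled blanket consent."""
    subject_id: str
    controller: str                   # controller's identity, as disclosed to the individual
    consents: list = field(default_factory=list)

    def give(self, purpose: str, mechanism: str) -> None:
        # Each purpose gets its own dated entry - granularity, and a demonstrable record.
        self.consents.append(PurposeConsent(purpose, datetime.now(timezone.utc), mechanism))

    def withdraw(self, purpose: str) -> None:
        # Withdrawal must be as easy as giving: same interface, no detriment.
        for c in self.consents:
            if c.purpose == purpose and c.is_active():
                c.withdrawn_at = datetime.now(timezone.utc)

    def active_purposes(self) -> set:
        return {c.purpose for c in self.consents if c.is_active()}
```

The point of the structure is that withdrawal flips a flag on one purpose only: the dated record of the original consent survives (so past processing can still be demonstrated) and consents to other purposes are untouched.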

Consent in research world

Getting granular consent (meaning consent specific to a given purpose) can be repetitive.  Bundling up different consents in one is not allowed, so multiple purposes make for long lists of consents and the risk of consent fatigue.  Other lawful bases may be more convenient, and consent should not be the default or unthinking route for controllers.  Aside from the high threshold for adequate consent, the GDPR’s transparency agenda means that there is a strong argument that if consent is given as the lawful basis at the outset there can be no substitution of a different legal basis if consent is withdrawn.

Getting granular consent can be difficult.  GDPR recognises that it may not be possible to fully identify the purpose of scientific research processing at the point of data collection and acknowledges that individuals could consent only to certain areas of research.  GDPR’s principles are relaxed for the benefit of scientific research but they continue to apply.  The purpose of the processing must still be described but it is enough for the research purpose to be ‘well described’ rather than specific.  Transparency is a safeguard where specific consent is not possible.  Research plans should be available.  Consent should be refreshed as the research progresses.

Consent must be freely given.  Does a research participant have a free choice?  Probably yes, if the intended processing is not arbitrary or unusual and if the information provided is adequate and accurate.  An informed refusal to join a clinical trial will not lead to standard treatment being withdrawn so there is no detriment.  But what if the standard treatment is not working?  If the individual has to consent to arbitrary processing of their personal data in order to take what may be their only remaining hope then it is difficult to see that as a free choice.

Consent can be withdrawn.  Researchers have some comfort in that processing that has already been carried out remains legitimate after consent is withdrawn.  But further processing must stop which threatens the ongoing research project, unless the data can be disentangled.  To make matters worse (for the researcher), if there is no other legal basis for holding the data then it may be necessary to delete it – more difficult disentangling, especially if the individual forces deletion through their right to be forgotten.

What can the worried researcher do about the risk of withdrawal?  Anonymise the data and carry on is always a good answer.  Rely on a different legal basis in the first place (and carry on) is another good answer.

Sidestepping the issue by making the consent irrevocable is not a good answer: it would breach the requirement that consent can be withdrawn at any time.

A sneaky lawyer’s answer may be to embrace the requirement that consent must be as easy to withdraw as to give.  If changing formats involves ‘undue effort’ then avoid electronic means and require consent to be in writing.  This answer is not guaranteed by any stretch of the imagination: the data controller is essentially betting that few will bother to put pen to paper to withdraw.

Clearly GDPR consent is a troublesome beastie but there is one strong point in its favour.  Using consent as the legal basis for processing promotes trust.  Repeatedly refreshing that consent as the research progresses reinforces trust.  Trust makes the engagement stronger.  Perhaps the prize of a stronger and more committed and engaged research cohort based on consent is worth it?


Filed under Databases

Using personal data in research: all change….?

Pondering, as one does, the likely impact of the General Data Protection Regulation on one’s working life, this Blogger has been trying to figure out how simple it will be to use personal data for research purposes (especially research in healthcare) after 25th May 2018 – the day on which the GDPR comes into force.  Before you ask, whatever happens with Brexit, the timing is such that the GDPR will come into force in the UK.

The GDPR is similar and yet different to the present Data Protection Act.  Similar in that the use of personal data is still governed by a series of principles and that processing of personal data must have a lawful basis.  Different in the detail of the duties placed on data controllers and processors, the rights granted to individuals and the justifications available to show that data is being processed lawfully.  For now, this Blogger is focussing on the research use context.


Oxford is 51.7520° N

The GDPR allows some latitude for research uses.  ‘Latitude’ is not the same as ‘get out of jail free’.  If research users apply appropriate safeguards and data minimisation (limiting any processing to the extent necessary for the particular purpose) then some of the individual’s rights may be excluded.  But the core principles of the GDPR still apply.

Today, consent is the researcher’s go-to justification for processing personal data.  Under the DPA and the GDPR, processing is lawful if the individual has given consent.  However, GDPR consent is a different animal to DPA consent.  The GDPR sets higher standards in terms of information (specific uses and specific recipients should be listed) and record keeping.  The GDPR is clear that it must be as easy to withdraw as to give consent – potentially really troublesome for a research project.  The ICO’s draft guidance talks of obtaining granular consent that describes in advance all the proposed uses of the personal data and everybody who will have access to the personal data.  The consent will have to be specific and records comprehensive.  Under the DPA a researcher can be (fairly) comfortable with wording consenting to the use of personal data for a defined project ‘and other related research’.  Under the GDPR, the researcher will have to describe the project (ie the intended use), list all those that will have access to the personal data and explain which other projects the personal data may be used for.  In effect, ‘if you’re not on the list, you’re not coming in’.  Thankfully, a pragmatic ICO recognises that not all future research uses can be specified in advance and the guidance allows some scope to ‘do the best you can’.

The result of these changes?  From the morning of 25th May 2018, existing consents may be rendered inadequate.

Can you hear the sounds of the research based economy grinding to a halt?  Be afraid, but not petrified.  Other possible means of demonstrating that processing has a lawful basis may be available.

First possibility is legitimate interest: GDPR treats processing as lawful to the extent that it is necessary for the purposes of the legitimate interests of the data controller as balanced against the impact on the individual concerned.  An interest is the broader aim or stake that the controller has in the processing.  It does not need to be described in advance but it will need to fall within the reasonable expectations of the individual.

The problem for healthcare research is that sensitive personal data (classified under GDPR as a ‘special category’), can only be processed where one of a separate list of exemptions applies.  The special categories include data concerning health.  This separate list of exemptions does not include legitimate interest: the legitimate interest justification does NOT justify the use of health data for research purposes.

Second possibility is that processing a special category of data is permitted where it is necessary for scientific research conducted in accordance with appropriate safeguards and where use of the data is proportionate to the research aim.  Useful but the emphasis is on ‘necessary’, ‘appropriate safeguards’ and ‘proportionate’.

A third possibility is to use anonymous data.  Like the DPA, the GDPR only applies to data relating to an identified or identifiable individual.  Currently, individuals do not have to give their consent for their personal data to be anonymised.  So, anonymise the data and all your problems fall away.


Anonymous or not…?

Inevitably, it is not that easy.  How anonymous does the data have to be before it no longer relates to a living and identifiable individual?  Today’s test is whether the anonymisation process is robust enough to be likely to defeat the efforts of the Motivated Intruder (about whom this Blogger has mused before).  The problem is that big data makes more things possible.  More pieces of the jigsaw are available to be found and linked together.  The Motivated Intruder doesn’t have to try too hard.

Despite its difficulties, consent may still be a useful possibility.  The GDPR permits processing of special category data where the individual has given explicit consent for a specified purpose.  The granular nature of consent has already been considered: proposed uses must be specified in advance.  In addition, the consent cannot be coerced – an outcome cannot be conditional on consent being given.  This may be a problem for commercial providers (‘you can only use this service if you give me all your personal data’).


A simple answer: the Russians took a pencil…

It is less likely to be a problem in research world.  Does ‘you must consent if you want to participate in this clinical trial’ amount to imposing a condition?  Probably not.  Research is not the provision of goods or a service.  But the problem remains that it must be as easy to withdraw consent as it was to give consent in the first place.  Consent is not a simple answer.

Clearly, researchers (and their admin support!) will have to plan carefully to comply with GDPR.  That is not a Bad Thing: behind every data point there is an individual who deserves protection.  In any case, facing more detailed provisions is not the same as being prevented from performing research.  The GDPR is an intricate piece but, like eating an elephant, it can be dealt with in small chunks.  So, as a starting approach for those wishing to use personal data in their research:

First, establish what data it is that you wish to process.   Do you need to process all of it (data minimisation)?  Could you use anonymous data instead?

Second, establish whether it is a special category of data (eg health data) and, if so, whether the intended use is permitted by any of the available exemptions, including processing necessary for scientific research or explicit (granular) consent – remembering that legitimate interest is not on the list of exemptions for special category data such as health data.

Third, if it is not a special category of data, or, if it is a special category but there is an exemption available, then check that the proposed processing is lawful.  Essentially that means demonstrating that Article 6 of the GDPR is satisfied.  That is worthy of a separate blog post in itself…
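Hedged heavily – the following is a rough decision aid, not legal advice, and the function name and parameters are invented for illustration – the three steps above might be sketched as:

```python
def lawful_basis_checklist(uses_personal_data: bool,
                           special_category: bool,
                           necessary_for_research_with_safeguards: bool,
                           has_explicit_granular_consent: bool) -> str:
    """Rough decision aid mirroring the three steps above. Not legal advice:
    a real assessment also covers data minimisation, proportionality and
    the detail of Article 6."""
    # Step 1: could you use anonymous data instead? Then GDPR does not apply at all
    # (provided the anonymisation would defeat the Motivated Intruder).
    if not uses_personal_data:
        return "Outside GDPR: data is anonymous (check the anonymisation is robust)"
    # Step 2: special category data (eg health data) needs an exemption.
    if special_category:
        if necessary_for_research_with_safeguards:
            return "Possible exemption: necessary for scientific research (proportionate, with safeguards)"
        if has_explicit_granular_consent:
            return "Possible exemption: explicit, granular consent"
        return "No exemption identified: do not process (legitimate interest is not on this list)"
    # Step 3: ordinary personal data still needs an Article 6 lawful basis.
    return "Check Article 6 lawful basis (consent, legitimate interest, etc.)"
```

The sketch deliberately puts the anonymisation question first: if the answer to step 1 is 'use anonymous data', steps 2 and 3 never arise.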



Filed under Intellectual Property, universities