How can we keep science honest in a world of open data?

Apr 14, 2016


By Dorothy Bishop

For years, researchers squirreled their data away after completing a study. When I started out in research in the 1970s, there were few options for sharing data: there was no email or internet. I have dim memories of analysing data from the 1958 National Child Development Study. The files arrived on enormous disks that I had to take to the local computer centre to read.

Now, though, we have ways of not just storing, but electronically sharing data. Archiving is not trivial: it requires proper documentation of data, and anonymisation when human participants are involved. But the advantages are clear to see: data in an archive can be re-used by other scientists, increasing its potential value. Data can also be future-proofed, avoiding the scenario where key results exist only on a kind of floppy disk that can no longer be read.

But as we move to wider data-sharing, new questions arise. In particular, who should have access to the data? The simplest answer is everyone: the scientist could just put their data out there, and anyone and everyone could view it. In many areas, this is unproblematic, but some scientists have reservations about completely free access, even if they agree in principle with open data.

In some cases, there are concerns that data may be misused by people with conflicted interests or a specific ideological agenda. A few weeks ago, there was uproar when it was found that Robert De Niro planned to screen a film, Vaxxed, at the Tribeca Film Festival. The film highlights an analysis of data on autism and vaccination from a large US database (CDC) which claimed to find a greatly increased rate of autism in children who had been vaccinated, provided they were African-American boys vaccinated in a specific time window. It was argued that there was a conspiracy to cover up this shocking statistic, even though the analysis was clearly flawed, the results were discrepant with the rest of the literature, and the paper was subsequently retracted. It could be argued that overall, this was a win for the self-correcting process of science, because the errors in the analysis were quickly discovered, and when Robert De Niro was made aware of the concerns about the misinformation in the film, he withdrew it from the festival. But there’s no doubt that damage was done. Once conspiracy theories get established, they can be difficult to dislodge. From the point of view of anti-vaxxers, the withdrawal of the film just provides further evidence that there is a conspiracy to silence those who speak the truth.

Would the situation have been different if there had been restrictions on access to the data? Probably not. The problem is not so much who has the data, as what they do with it. A particular danger comes from unrestricted data-trawling of the kind that was evident in the CDC analysis. Although these dangers are especially serious when those doing the analysis are determined to find a particular result, they are not negligible when reputable and relatively open-minded scientists do secondary analyses.

Large datasets allow for analytic flexibility, and it is all too tempting to trawl a dataset for “significant” associations. Exploratory analysis is important for scientific progress, but inferential statistics lose their meaning if the researcher has selected which data to analyse on the basis of the observed results. One answer is to reproduce findings in a new dataset. An alternative is to require those analysing the data to specify in advance what analyses they plan to do – this is directly parallel to the idea of pre-registration of yet-to-be-done studies, which is beginning to gain traction in many areas of science as a way of improving reproducibility by distinguishing hypothesis-testing from exploratory analyses.
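As a rough illustration of why trawling undermines inferential statistics, the sketch below simulates a dataset in which no real effect exists and then tests many group comparisons anyway. Everything in it is an assumption made for illustration and has no connection to the CDC data: the function names `two_sample_p_value` and `trawl`, the 200 variables, the 100 participants per group, the 0.05 threshold, and the use of a simple normal-approximation test.

```python
import math
import random
import statistics


def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def trawl(n_variables=200, n_per_group=100, alpha=0.05, seed=1):
    """Count 'significant' comparisons when both groups come from the same distribution."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_variables):
        # Both groups are pure noise, so any "effect" found here is a false positive.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sample_p_value(group_a, group_b) < alpha:
            hits += 1
    return hits


false_positives = trawl()
print(f"{false_positives} of 200 null comparisons reached p < 0.05")
```

With a 0.05 threshold, about 5% of comparisons (roughly 10 of 200 on average) are expected to look “significant” even though nothing is there. Reporting only those hits, without saying how many comparisons were run, is what makes the p-values uninterpretable.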

But how would we keep everyone honest? If we place restrictions on who has access to the data and what they do with it, we could end up with those who collected the data acting as gatekeepers. This runs the risk that if scientists themselves have conflicts of interest or ideological agendas, they might deny access to others on spurious grounds.



4 comments on “How can we keep science honest in a world of open data?”

  • @OP – But as we move to wider data-sharing new questions arise. In particular, who should have access to the data?

    It seems there are those who really don’t want some data to be seen!

    http://www.bbc.co.uk/news/technology-36053673

    The university at the centre of a pepper spray row paid consultants more than $175,000 (£123,000) to bury online search results about the incident.

    In 2011, a police officer pepper-sprayed students protesting at UC Davis, California, at close range.

    The university later hired consultants to “eradicate references” to the incident in search results.

    UC Davis said it wanted the reputation of the university to be “fairly portrayed”.

    Videos of the incident, which have been viewed millions of times online, show a police officer pepper-spraying students who were peacefully protesting on the university campus.

    In a statement issued at the time, university Chancellor Linda Katehi said she was “deeply saddened” by the event and took “full responsibility for the incident” but refused to resign when challenged by the university’s academic staff association.

    “I do not think that I have violated the policies of the institution,” she said.

    Documents released after an investigation by local newspaper the Sacramento Bee found that the university hired consultancy Nevins & Associates in 2013 to “eliminate” Google search results.

    The consultants identified “online evidence” and “venomous rhetoric about UC Davis” was being shared online.

    The campaign was also designed to eliminate negative search results about Ms Katehi.

    Documents suggested this could be achieved with a “flood of content with positive sentiment and off-topic subject matter”, and proposed hosting content on Google’s own services, which would appear higher in the firm’s search results.

    Speaking to the Sacramento Bee, UC Davis spokeswoman Dana Topousis said: “We have worked to ensure that the reputation of the university, which the chancellor leads, is fairly portrayed.”

    The consultancy was paid by the university’s communications department. Its budget has increased from almost $3m in 2009, to $5.5m in 2015.




  • Matthew 7

    7 Ask, and it shall be given you; seek, and ye shall find; knock, and it shall be opened unto you:
    8 For every one that asketh receiveth; and he that seeketh findeth; and to him that knocketh it shall be opened.

    We have much to learn from religions. In this case it is a lesson on what can happen when human credulity and ignorance meet human ambition – leavened by an authority that presents industry as a cure for all that frustrates us.

    People uneducated in critical thinking are easily persuaded that hard work alone is required to succeed where others fail. Of course, if that were true, half of Africa would be populated by billionaires – the female half.

    The above quote from Matthew illustrates how this old, old view is pressed into service by religions even to the extent of encouraging followers to work on their level of faith – on believing even in the absence of data.

    What happens to science when it is caught up in our ambition?

    Would the situation have been different if there had been restrictions on access to the data? Probably not. The problem is not so much who has the data, as what they do with it.

    Quite so.

    In this quote Dorothy Bishop repeats her previous answer to her own question:

    … who should have access to the data? The simplest answer is everyone: the scientist could just put their data out there, and anyone and everyone could view it. In many areas, this is unproblematic, but some scientists have reservations about completely free access, even if they agree in principle with open data …
    .
    … there are concerns that data may be misused by people with conflicted interests or a specific ideological agenda …

    Is this a new problem?

    No.

    Political lobbyists funded by vested interests have long been taking other people’s research and misrepresenting it, re-interpreting it through a dogmatic lens, or discrediting it through ad hominem, association, language, and logical fallacies of every kind (deliberately applied, it seems certain, in many cases), aided by those vested interests’ relationships with the engines of propaganda in the Media.

    I recently had an argument with a friend who questioned a statement in a British Government pamphlet that “3 million jobs [in Britain] are tied to membership of the European Union”. One would think that the word “tied” is unambiguous, but not to my friend – a regular reader of the anti-European Union rhetoric of a certain publisher for at least the last decade – he believes that the word is, indeed, highly ambiguous and probably deliberately misleading.

    This is a trivial example of the concern in Dorothy Bishop’s column, and it shows how those with a vested interest need not work all that hard to undermine real science or in this case a properly conducted economic study. If people have planted an idea in your head (in this case: EU bad) – Bishop’s focus is on conspiracy theories, but pretty much any belief is the same – it takes a significant effort of critical thinking to change your mind. Robert De Niro has apparently flip-flopped on the anti-Vaxxer film, and his uncertainty is a good example of how those unskilled in critical thinking have difficulty changing their minds.

    This is normal: we’re all clearly wired to think deeply about things, about a course correction so to speak, only every now and then – most of the time we’ve evolved to run on auto-pilot.

    Dorothy Bishop’s solution is, in précis, an intermediary – a Doorman to the club, a Night Guard at the museum.

    I don’t see how it gets us past what Bishop calls:

    … a breakdown of trust between scientists and their critics

    ?

    Very few people understand statistics, me included, but I’ve learned to trust some of them because I have learned to apply critical thinking (imperfectly).

    Anyone who has lost (or, indeed, never had) trust in scientists, it seems to me, is unlikely to be swayed by the idea that an intermediary skilled in the smoke and mirrors of stats can be their guide.

    I can think of three things that scientists can do to, at least, minimize the problem that Bishop outlines.

    Publish and be damned.

    Be ready to explain your work to non-scientists – a chore that many shirk, and some seem to treat as beneath them.

    Be an energetic political activist for the effectiveness of education particularly in science, logic and critical thinking.

    Like the arguments of lawyers, the arguments of vested interests don’t change – only the trust in who is presenting the truth, and who is spinning us a line.

    Peace.




  • @OP – In some cases, there are concerns that data may be misused by people with conflicted interests or a specific ideological agenda.

    There are certainly some security related issues, involving explosives, drugs, alarm-systems, and actions to restrict criminal or terrorist activities, which should not be freely available to people who will misuse them against the wider public!




  • http://www.bbc.co.uk/news/technology-36102959

    The chancellor of a university has apologised over the hiring of a PR firm that promised to bury online search results about the pepper spraying of peaceful protesters.

    Dr Linda Katehi admitted that UC Davis, California, brought in a company “specialising in what is known as search engine optimisation”.

    But she denied that the institution had sought to “rewrite history”.

    Dr Katehi has faced calls to resign over the 2011 incident and its fallout.

    Earlier this month, the Sacramento Bee newspaper reported that UC Davis hired the PR firm Nevins and Associates on a six-month contract at $15,000 (£10,400) per month.

    The university was seeking to deal with the reaction to the incident in 2011 in which students who were protesting on the university’s campus, near Sacramento, in California, were pepper-sprayed by a campus police officer.

    The paper published a document which it said set out the firm’s proposed strategy. The document read: “Nevins and Associates is prepared to create and execute an online branding campaign designed to clean up the negative attention the University of California, Davis, and Chancellor Katehi have received related to the events that transpired in November 2011.”

    The document also referred to “eradication of references to the pepper spray incident in search results on Google for the university and the Chancellor” via an “aggressive and comprehensive online campaign to eliminate the negative search results”.

    The Los Angeles Times reported last week that the University of California’s student association had called on Dr Katehi to resign over the news.

    It does seem more likely that she is sorry about being caught, rather than sorry for the attack and cover-up!



