Showing posts with label database.

Monday, July 21, 2025

The Urge to Persecute

People have always sought to project evil onto their neighbours, and that desire now extends to random strangers on the Internet. Malcolm Gaskill shows how the science of witch-hunting took a leap forward in the Enlightenment period, thanks to the meticulous assembly and analysis of data to confirm or confound hypotheses, and describes how one seventeenth century German woman was found innocent of witchcraft only after the intervention of her son, who was able to use these same tools in her defence. Of course it helped that her son happened to be one of the greatest intellectuals of the period, Johannes Kepler.

Empiricism made witchcraft possible as an actionable crime before it made it an impossible one. Kepler saved his mother through formidable concentration, sticking to a firm line of reasoning and dissecting his opponents’ arguments, point by point.

In this week's news, two tech executives were spotted cuddling one another at a Coldplay concert, drawing attention to themselves by ducking in a guilty fashion when they realized they were being shown on the big screen. Internet sleuths were able to discover their identities, public shaming ensued, and jobs and marriages were lost - an example of what Cathy O'Neil calls Networked Shame. In his commentary on the incident, Brandon Vigliarolo noted our willingness to persecute someone for a perceived wrong despite not knowing the full story.

Vigliarolo then went on to remind us of the eagerness with which other tech executives are pushing mass surveillance, which will apparently keep everyone on their best behavior through the use of constant real-time machine-learning-powered monitoring.

Because we can trust machine learning to know the full story before jumping to conclusions, can't we?


See also: Witnessing Machines Built in Secret (November 2017), Metrication and Demetrication (August 2021), The Purpose of Shame (April 2022)


Malcolm Gaskill, Money, Sex, Lies, Magic (London Review of Books, 38/13, 30 June 2016) 

Malcolm Gaskill, Social media witch-hunts are no different to the old kind – just bigger (Guardian, 13 October 2016)

Cathy O'Neil, The Shame Machine (New York: Crown, 2022) 

Jon Ronson, So You've Been Publicly Shamed (Picador 2015)

Geoff Shullenberger, The Scapegoating Machine (The New Inquiry, 30 November 2016)  

Brandon Vigliarolo, Ellison declares Oracle all-in on AI mass surveillance, says it'll keep everyone in line (The Register, 16 September 2024)

Brandon Vigliarolo, Coldplay kiss-cam flap proves we’re already our own surveillance state (The Register, 18 Jul 2025)

Friday, November 24, 2023

Data and the Genome

The word data comes from the Latin meaning that which is given. So one might think it is entirely appropriate to use the word for our DNA, given to us by our parents, thanks to millions of years of evolution. DNA is often described as a genetic code; the word code either refers to the way biological information is represented in the molecular structure of chromosomes, or to the way these chromosomes can be understood as a set of instructions for building a biological entity. Watson and Crick used the word code in their 1953 Nature article.
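
To make the "instructions" sense of the word code a little more concrete, here is a minimal sketch in Python - an invented sequence and a deliberately truncated codon table, purely for illustration - that reads a DNA string three bases at a time and looks each codon up.

    # A minimal illustration of DNA as a "code": read an (invented) sequence
    # three bases at a time and look each codon up in a table. Only a handful
    # of entries from the standard codon table are shown here.
    CODON_TABLE = {
        "ATG": "Met",   # also the start codon
        "TTT": "Phe",
        "GGC": "Gly",
        "TGA": "Stop",
    }

    def translate(dna):
        """Return the amino acids encoded by a DNA string, halting at a stop codon."""
        amino_acids = []
        for i in range(0, len(dna) - 2, 3):
            residue = CODON_TABLE.get(dna[i:i + 3], "?")   # "?" marks codons not in the table
            if residue == "Stop":
                break
            amino_acids.append(residue)
        return amino_acids

    print(translate("ATGTTTGGCTGA"))   # ['Met', 'Phe', 'Gly']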

However, when people talk about the human genome, they are often referring to a non-biological representation in some artificial datastore. In other words, given by biology to data science.

Shannon E. French objects to talking about data stored on DNA like it’s some kind of memory stick, and Abeba Birhane sees this as part of the current trend that is so determined to present AI as human-like at all costs that describing humans in machinic terms has become normalised.

Elsewhere, Abeba Birhane is known for her strong critique of AI. As well as raising important ethical issues (algorithmic bias, digital colonialism, accountability, exploitation/expropriation), she has raised concerns about the false promise of AI hype.

But describing humans (or other biological entities) in machinic terms, or treating them as instruments, is far older than AI. When we replace animals with technical devices (canaries, carrier pigeons, horses), the substitution implies that the animals had been treated as devices; the replacement is often justified by the argument that technical devices are cheaper, more efficient, or more reliable, or don't require regular breaks - or are simply more modern. Conversely, when scientists try to repurpose DNA as a data storage mechanism, this also seems to mean treating biology in instrumental terms.
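
And as a rough sketch of what repurposing DNA as a data storage mechanism means in practice, here is a toy Python encoding that maps pairs of bits onto the four bases and back again. Real DNA storage schemes add error correction and avoid awkward runs of the same base; this only shows the underlying idea.

    # A toy version of DNA data storage: two bits per base, and back again.
    # The particular mapping is an arbitrary choice for illustration.
    BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
    BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

    def encode(data):
        bits = "".join(f"{byte:08b}" for byte in data)
        return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

    def decode(strand):
        bits = "".join(BASE_TO_BITS[base] for base in strand)
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

    strand = encode(b"data")
    print(strand)                      # CGCACGACCTCACGAC
    assert decode(strand) == b"data"   # round trip back to the original bytes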

But arguably what is stored or encoded in the DNA - whether in its original biological manifestation or more recent exercises in bioengineering - is still data, regardless of how or for whom it is used.



Abeba Birhane, Atoosa Kasirzadeh, David Leslie and Sandra Wachter, Science in the age of large language models (Nature Reviews Physics, Volume 5, May 2023, 277–280)

Abeba Birhane and Deborah Raji, ChatGPT, Galactica and the Progress Trap (Wired, 9 December 2022)

Grace Browne, AI is steeped in Big Tech's 'Digital Colonialism' (Wired, 25 May 2023)

J.D. Watson and F.H.C. Crick, Genetical Implications of the Structure of Deoxyribonucleic Acid (Nature, 30 May 1953)

Related posts: Naive Epistemology (July 2020), Limitations of Machine Learning (July 2020), Mapping out the entire world of objects (July 2020), Lie Detectors at Airports (April 2022), Algorithmic Intuition (November 2023)

Thursday, October 16, 2008

Hedgehog Politics

According to Archilochus (via Isaiah Berlin), the fox knows lots of little things, and the hedgehog only knows one thing.

For the political hedgehog, nearly any problem can be linked to terrorism.

And the solution to nearly every problem seems to be some kind of central database.

Is that what they call asymmetric information warfare?

Friday, February 29, 2008

DNA and Crime 2

In a police state, anything that makes the police more effective is a Good Thing.

We are being bombarded with various measures (actual and proposed) that apparently make the police more effective. Longer detention-without-trial for terrorist suspects. CCTV evidence. And a national DNA database.

Proponents of these measures never fail to slip positive messages into the news media.

On the one hand, here's a terrible crime that was fortuitously solved many years later, *thanks to* the brilliant intervention of DNA scientists. On the other hand, here are some terrible crimes that may never be cleared up, *because* the relevant DNA wasn't recorded.

On the one hand, here is a wicked terrorist whom we were forced to release after a mere 28 days, although we *knew* he was plotting something terrible. On the other hand, here is another wicked terrorist, whom we were able to prosecute *because* the evidence just happened to emerge after a mere 45 days of investigation.

Opponents of these measures sometimes argue that they are ineffective or inaccurate. It is implausible to believe that evidence will suddenly appear after 28 days that was not available before. They say they will only agree to this measure if it can be shown that it sometimes works.

Other opponents argue that they are disproportionate. They do not deny that they may possibly work in a few cases, but claim that the benefits are grossly outweighed by the illiberal side-effects.

The problem with both of these lines of argument is that they are vulnerable to constructed refutation. Detection can be attributed to DNA for crimes that might possibly have been solved by other means. Suspected terrorists can be detained for the maximum permitted period, not just because the investigators are under less pressure to find evidence more quickly, but also because the investigators need to demonstrate that the currently permitted maximum is barely enough. Under certain conditions, the statistics could start to look very favourable, enough to overcome the "disproportionate" argument.

And the supporters of these measures have a further argument up their sleeve - the hypothetical deterrent effect. Imagine how many more crimes might have been perpetrated: would-be criminals who saw the cameras, or remembered the DNA held hostage in the police database, and decided to stay home and watch Big Brother instead. Imagine how many more people might have attempted to smuggle dangerous chemicals or stiletto heels onto aeroplanes, if it had not been for the constant vigilance of dedicated security screeners.

My point is this. Opponents of some specific measure may declare that the measure is unacceptable or counter-productive in a civilized society, or may declare that the measure could only be accepted if such-and-such facts could be produced. And they may believe that this opposition is fairly solid, because these facts are extremely unlikely.

But what if the advocates of these measures are able to influence the facts? ...

Of course, I am not saying we are in a police state today. I am not even saying that specific measures would turn our country into a police state. All I am saying is that it is possible to see how repeated application of certain lines of argument could result in a police state.

Update


Here's some evidence to support my conjecture:

Wednesday, February 20, 2008

Disc Error

Last year, Netherlands police sent a CD to police forces around Europe, containing DNA profiles gathered from thousands of crime scenes in the Netherlands.

The British police aren't actually paid to solve Dutch crimes, so they did what anyone would do - shove the CD in a drawer. When they eventually got around to looking at the data, they found a very small number of matches against the UK DNA database.
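
(The matching step itself is conceptually simple - something like the Python sketch below, with invented placeholder identifiers standing in for real profiles. The expensive part is everything around it: validating, interpreting and following up each hit.)

    # Schematic only: cross-matching crime-scene profiles against a national
    # database as a set intersection. The identifiers are invented placeholders;
    # real DNA profiles are sets of STR markers, and real matching also allows
    # partial and familial hits.
    dutch_crime_scene_profiles = {"profile-0173", "profile-0488", "profile-2291"}
    uk_national_database = {"profile-0488", "profile-1034", "profile-2291", "profile-3307"}

    matches = dutch_crime_scene_profiles & uk_national_database
    print(len(matches), "matches:", sorted(matches))   # 2 matches: profile-0488, profile-2291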

Opposition politicians are making a ridiculous fuss about this. David Cameron describes it as a "catastrophic error of judgement", but that's rather unfair. Busy people have to juggle priorities. The police receive and collect huge amounts of data, and it is immensely costly to sift through and interpret it all. So what are they supposed to do, drop everything whenever their Dutch friends pop a CD in the post?

Of course, any scandal associated with mislaid data helps to remind people about privacy and security, and general Government incompetence, but that doesn't exactly seem to be the problem here.

The problem here is the belief that the existence of data establishes an imperative to DO SOMETHING with the data. And DNA data is "scientific", which makes this imperative all the stronger. This is exactly why many people don't like the idea of collecting DNA data in the first place.

Source: DNA disc failings 'catastrophic' (BBC News, Feb 20th 2008)

Monday, September 24, 2007

Big Bang DNA

A senior British judge argued earlier this month that a national DNA database should contain everybody or nobody [BBC News, September 5th 2007 - see also Robin Wilton and Scribe].

In a rational world, multiplying the costs and risks of a future project without matching benefits usually reduces the chances of the project's going ahead. But in the absurd world of public sector projects, it takes a lot more than outrageous costs and implausible benefits to kill a project.



One way to kill a project is to argue for its expansion. People may pretend to support your project, may suggest ways of making it even grander and more expensive, but their real agenda is sabotage - trying to make sure the project never happens. By making it large and complex, they hope to make it impossible.

Of course, we cannot always infer deliberate intention. Some people adopt the same tactic in innocent enthusiasm, so excited by the potential of an idea that they do not realise that they are overloading it. And some people are driven by a perverse interpretation of systems thinking: obsessively trying to avoid the environmental fallacy, they fall into the opposite fallacy, which I call the Fallacy of Escalation.

For an example of project escalation, see Eberhard's classic story "The Warning of the Doorknob", which is frequently reproduced in software engineering circles (for example in this piece by Ed Yourdon). See also my own short essay In Praise of Scope Creep, which has also been widely reproduced.

John P. Eberhard, “We Ought to Know the Difference,” Emerging Methods in Environmental Design and Planning, Gary T. Moore, ed. Cambridge, Mass.: MIT Press, 1970, pp. 364-365.

See also Alëna Iouguina, Systems Design: Because everything* is systems (Shopify UX, 16 February 2017)

Saturday, May 05, 2007

Evil uses of DNA

How many evil uses for DNA can we think of?

  • Leave someone else's DNA at the scene of the crime. (Get a Saturday job sweeping up at the hairdresser.) (We are a nation of suspects)
  • Get an accomplice to leave your DNA hundreds of miles from the scene of the crime, in support of a false alibi. (Low Copy Number DNA testing)
  • If you don't have an identical twin, get yourself a clone. Then nobody will be able to prove which one of you did it. (Two quickies ...)
  • Steal someone's DNA and create a clone. Then threaten to commit all sorts of crimes unless they pay protection money.
  • If you know the genetic patterns of people who may be especially vulnerable to your evil schemes (the gene for "trust" perhaps), you can search the national DNA database looking for potential victims. (Five civil servants fined over DNA espionage)
  • ... any more?

Especial thanks to Robin Wilton and the anonymous FishNChipPapers blogger.

Update: DNA Evidence can be fabricated (New York Times, 17 August 2009)

Wednesday, May 18, 2005

Surveillance and its Effects

Surveillance is a process of keeping people (such as customers and employees, as well as members of the public) under close supervision. What are the effects of surveillance? Here are two answers from an interesting blog (now called Into The Machine) whose main purpose seems to be to critique the authoritarian policies of the UK Home Secretary (past and present).
  • All CCTV monitoring does is lock down the public face of our nation, allowing us in our public capacity to simply sweep aside all the factors that lead to the crime and attitude we're experiencing every day. (The Two Faces of CCTV)
  • Surveillance will always produce nothing but underground revelry and a false sense of security. (The Ubiquity of Unnatural Surveillance)
[update: blog title and URLs changed, content looks the same]

It is clearly important to understand the effects on those being observed. But it is also interesting to note the effects on those doing (or relying upon) the observing.

Jeremy Bentham’s panopticon was originally a prison so designed that the warder could watch all the prisoners at the same time. By extension, this term is used to describe any technical or institutional arrangement to watch/monitor large numbers of people. It forms part of Foucault's analysis of discipline, and provides a useful metaphor for various modern technologies:
  • CCTV
  • workforce monitoring
  • database systems such as customer relationship management (CRM)
  • Google
The panopticon provides surveillance, and may result in a loss of privacy for the people being watched/monitored, but may also make people feel they are being looked after (better quality of service, safer). If you know you’re being watched, this may trigger various feelings – both positive and negative.

Besides the impact on the people being watched, the panopticon often has an adverse effect on the watcher. The panopticon gives the illusion of transparency and completeness – so the watcher comes to believe three fallacies:

  • that everything visible is undistorted truth
  • that everything visible is important
  • that everything important is visible

This is one of the reasons why surveillance mechanisms often become dysfunctional even for those doing the surveillance. For example, instead of customer relationship management (CRM) promoting better relationships with the customer, it becomes a bureaucratic obsession with the content of the customer database.

See also Surveillance 2 (September 2005)