Stanislaw Lem, Herbert Simon and artificial intelligence as a broad social technology project (Man / Machine X)

Why do we develop artificial intelligence? Is it merely because of an almost Faustian curiosity? Is it because of an innate megalomania that suggests that we could, if we wanted to, become gods? The debate today is rife with examples of risks and dangers, but the argument for the development of this technology is curiously weak.

Some argue that it will help us with medicine and improve diagnostics; others dutifully remind us of the productivity gains that could be unleashed by deploying these technologies in the right way; and some even suggest that there is a defensive aspect to the development of AI — if we do not develop it, the result will be an international imbalance in which the nations that have AI are akin to those that have nuclear capabilities: technologically superior and capable of dictating the fates of the countries that lag behind (some of this language is emerging in the ongoing geopoliticization of artificial intelligence between the US, Europe and China).

Things were different in the early days of AI, back in the 1960s. The idea of artificial intelligence was then much more closely connected with a social and technical project, a distinct response to a set of challenges that seemed increasingly serious to writers of that age. Two very different examples support this observation: Stanislaw Lem and Herbert Simon.

Simon, in attacking the challenge of information overload – or information wealth, as he prefers to call it – suggests that the only way we will be able to deal with the complexity and rich information produced in the information age is to invest in artificial intelligence. The purpose, to him, is to help us learn faster – and if we take into account Simon’s definition of learning, which is very close to classical Darwinian adaptation, we realize that for him the development of artificial intelligence was a way to ensure that we can continue to adapt to an information-rich environment.

Simon does not call this out, but it is easy to read between the lines and see what the alternative is: a growing inability to learn and adapt, one that generates increasing costs and vulnerabilities – the emergence of a truly brittle society that collapses under its own complexity.

Stanislaw Lem, the Polish science fiction author, suggests a very similar scenario (in his famously unread Summa Technologiae), but his is more general. We are, he argues, running out of scientists, and we need to ensure that we can continue to drive scientific progress, since the alternative is not stability but stagnation. He views the machine of progress as a homeostat that needs to be kept in constant operation in order to produce, in 30-year increments, a doubling of scientific insights and discoveries. Even if we force people to train as scientists, he argues, we will not be able to grow fast enough to meet the need for continued scientific progress.

Both Lem and Simon suggest the same thing: we are facing a shortage of cognition, and we need to develop artificial cognition or stagnate as a society.

*

The idea of a scarcity or shortage of cognition as a driver of artificial intelligence is much more fundamental than any of the ideas we quickly reviewed at the beginning. What we find here is an existential threat to mankind, and a need to build a technological response. The line of thought, the structure of the argument, almost reminds us of the environmental debate: we are exhausting a natural resource and we need innovation to help us continue to develop.

One could imagine an alternative: if we say that we are running out of cognition, we could argue that we need to ensure the analogue of energy efficiency. We need cognition efficiency. That view is not completely insane, and in a certain way it is what we are developing through stories, theories and methods in education. The connection with energy is also quite direct, since artificial intelligence will consume energy as it develops. A lot of research is currently being directed at the energy consumption of computation. There is a boundary condition here: a society that builds out its cognition through technology does so at the cost of energy at some level, and the cognition/energy yield will become absolutely essential. There is also a more philosophical point in all of this, and that is the question of renewable cognition, sustainable cognition.

Cognition cost is a central element in understanding Simon’s and Lem’s challenge.

*

But is it true? Are we running out of cognition? How would you measure that? And is the answer really a technological one? What about educating and discovering the talent of the billions of people who today live in poverty, or without any chance of an education to grow their cognitive abilities? If you have 100 dollars – what buys you the most cognition (all other moral issues aside): investing in development aid or in artificial intelligence?

*

Broad social technological projects are usually motivated by competition, not by environmental challenges. One reason – probably not the dominant one, but perhaps a contributing factor nonetheless – that climate change seems to inspire so little action in spite of the threat is this: there is no competition at all. The whole world is at stake, and so nothing is at stake for anyone relative to anyone else. The conclusion usually drawn from that observation is that we should all come together. What ends up happening is that we get weak engagement from all.

Strong social engagement in technological development – what are the examples? The race for nuclear weapons, the race for the moon. In one sense the early conception of the project to build artificial intelligence was as a global, non-competitive project. Has it slowly changed to become an analogue of the space race? The way China is now approaching the issue is to some observers reminiscent of a Manhattan Project approach. [1]

*

If we follow that analogy a bit further — what comes next? What is the equivalent of the moon landing for artificial intelligence? Surely not the Turing test – it has been passed multiple times in multiple versions, and as such has lost much of its salience as a test of progress. What, then, would be the alternative? Is there a new test?

One quickly realizes that it probably is not the emergence of an artificial general intelligence, since that seems to be decades away, and a questionable project at best. So what would be a moon landing moment? Curing cancer (too broad, many kinds of cancer)? Eliminating crime (a scary target for many reasons)? Sustained economic growth powered by both capital investment strategies and deployment of AI in industry?

An aside: far too often we talk about moonshots without talking about what the equivalent of the moon landing would be. It is one thing to shoot for the moon, another to walk on it. Defined outcomes matter.

*

Summing up: we could argue that artificial intelligence was conceived of, early on, as a broad social project to respond to a shortage of cognition. It then lost that narrative, and today it is becoming more and more enmeshed in a geopolitical, competitive narrative. That will likely increase the speed with which a narrow set of applications develops, but there is still no single moon landing moment associated with the field that stands out as the object of competition between the US, EU and China. But maybe we should expect the construction of such a moment in medicine, military affairs or economics? So far, admittedly, it has been games that have provided the defining moments – tic-tac-toe, chess, go – but what is next? And if there is no single such moment, what does that mean for the social narrative, the speed of development and the evolution of the field?

 

[1] https://www.technologyreview.com/s/609038/chinas-ai-awakening/

Notes on attention, fake news and noise #4: Jacques Ellul and the rise of polyphonic propaganda part 1

Jacques Ellul is arguably one of the earliest and most consistent technology critics we have. His texts are due for a revival at a time when technology criticism is in demand, and even techno-optimists like myself would probably welcome that, because even if he is fierce and often caustic, he is interesting and thoughtful. Ellul had a lot to say about technology in books like The Technological Society and The Technological Bluff, but he also discussed the effects of technology on social information and news. In his bleak little work Propaganda: The Formation of Men’s Attitudes (New York, 1965 (1962)) he examines how propaganda draws on technology and how the propaganda apparatus shapes views and opinions in a society. There are many salient points in the book, and quotes that are worth debating.

That said, Ellul is not an easy read or an uncontroversial thinker. Here is how he connects propaganda and democracy, arguing that state propaganda is necessary to maintain democracy:

“I have tried to show elsewhere that propaganda has also become a necessity for the internal life of a democracy. Nowadays the State is forced to define an official truth. This is a change of extreme seriousness. Even when the State is not motivated to do this for reasons of actions or prestige, it is led to it when fulfilling its mission of disseminating information.

We have seen how the growth of information inevitably leads to the need for propaganda. This is truer in a democratic system than in any other.

The public will accept news if it is arranged in a comprehensive system, and if it does not speak only to the intelligence but to the ‘heart’. This means, precisely, that the public wants propaganda, and if the State does not wish to leave it to a party, which will provide explanations for everything (i.e. the truth), it must itself make propaganda. Thus, the democratic State, even if it does not want to, becomes a propagandist State because of the need to dispense information. This entails a profound constitutional and ideological transformation. It is, in effect, a State that must proclaim an official, general, and explicit truth. The State can no longer be objective or liberal, but is forced to bring to the overinformed people a corpus intelligentiae.”

Ellul says, in effect, that in a noise society there is always propaganda – the question is who is behind it. It is a grim world view, in which a State that relinquishes the responsibility to engage in propaganda yields it to someone else.

Ellul comments, somewhat wryly, that the only way to avoid this is to give citizens three to four hours a day to engage in becoming better citizens, and to reduce the working day to four hours – a solution he admits is simplistic and unrealistic, it seems, and one that would require that citizens “master their passions and egotism”.

The view raised here is useful because it clearly states an assumption that sometimes seems to underlie the debate we are having – that there is a necessity for the State to become an arbiter of truth (or to designate one), or someone else will take that role. The weakness in this view is a weakness that plagues Ellul’s entire analysis, however, and in a sense our problem is worse. Ellul takes, as his object of study, propaganda from the Soviet Union and Nazi Germany. His view of propaganda is largely monophonic. Yes, technology still pushes information on citizens, but in 1965 it did so unidirectionally. Our challenge is different and perhaps more troubling: we are dealing with polyphonic propaganda. The techniques of propaganda are employed by a multitude of parties, and the net effect is not to produce truth – as Ellul would have it – but to eliminate the conditions for truth. Truth is no longer viable in a set of mutually contradictory propaganda systems; it is reduced to mere feelings and emotions: “I feel this”. “This is my truth”. “This is the way I feel about it”.

In this case the idea that the state should speak too is radically different, because the state or any state-appointed arbiter of truth just adds to the polyphony of voices and provides them with another voice to enter into a polemic with. It fractures the debate even more, and allows for a special category of meta-propaganda that targets the way information is interpreted overall: the idea of a corridor of politically correct views that we have to exist within. Our challenge, however, is not the existence of such a corridor, but the fact that it is impossible to establish a coherent, shared model of reality and hence to decide what the facts are.

An epistemological community must rest on a fundamental cognitive contract, an idea about how we arrive at facts and truth. It must contain mechanisms of arbitration that are institutions in themselves, independent of political decision making and commercial interest. The lack of such a foundation means that no complex social cognition is possible. That in itself is devastating to a society, one could argue, and is what we need to think about.

It is no surprise that I take issue with Ellul’s assertion that technology is at the heart of the problem, but let me at least outline the argument I think Ellul would have to deal with if he were revising his book for our age. I would argue that in a globalized society, the only way we can establish that basic epistemological foundation to build on is through technology and collaboration within new institutions. I have no doubt that the web could carry such institutions, just as it carries the Wikipedia.

There is an interesting observation about the web here, an observation that sometimes puzzles me. The web is simultaneously the most collaborative environment constructed by mankind and the most adversarial. The web and the Internet would not exist but for the protocol agreements that have emerged as their basis (this is examined and studied commendably in David Post’s excellent book Jefferson’s Moose). At the same time, the web is a constant arms race around different uses of this collaboratively enabled technology.

Spam is not an aberration or anomaly, but can be seen as an instance of a generalized, platonic pattern in this space – a pattern that recurs throughout many different domains and has started to climb the semantic layers, from simple commercial scams to the semiosphere of our societies, where memes compete for attention and propagation. And the question is not how to compete best, but how to continue to engage in institutional, collaborative and, yes, technological innovation to build stronger protections and counter-measures. What is to disinformation what spam filters are to unwanted commercial email? It is not merely spam filters with new keywords; it needs to be something radically new, and most likely institutional in the sense that it requires more than just technology.

Ellul’s book provides a fascinating take on propaganda and is required reading for anyone who wants to understand the issues we are working on. More on him soon.

Notes on attention, fake news and noise #1: scratching the surfaces

What is opinion made from? This seems a helpful question to start off a discussion about disinformation, fake news and the similar challenges that we face as a society. I think the answer is surprisingly simple: opinion is ultimately made from attention. In order to form an opinion we need to pay attention to the issues and questions we are facing as a society. Opinion should not be equated with emotion, even if it certainly draws on emotion (to which we also pay attention); it also needs reasoned views in order to become opinion. Our opinions change, also through the allocation of attention, when we decide to review the reasons underlying them and the emotions motivating us to hold them.

You could argue that this is a grossly naive and optimistic view of opinion, and that what forms opinion is fear, greed, ignorance and malice – and that opinions are just complex emotions, nothing more, and that they have become even more so in our modern society. That view, however, leads nowhere. The conclusion for someone believing that is to throw themselves exasperated into intellectual and physical exile. I prefer a view that is plausible and also allows for the strengthening of democracy.

A corollary of the above is that democracy is also made from attention – from the time we set aside to form our opinions and contribute to democracy. I am, of course, referring to an idealized and ideal version of democracy in which citizenship is an accomplishment and a duty rather than a right, and where there is a distinct difference between ”nationality” and ”citizenship”. The great empires of the world seem always to have had a deep understanding of this – Rome safeguarded its citizens, and citizenship was earned. In contrast, some observers note that the clearest sign of American decline is that US citizenship is devolving into US nationality. Be that as it may — I think there is a great deal of truth in the conception of democracy as made of opinion formed by the paying of attention.

This leads to a series of interesting questions about how we pay attention today, and what challenges we face when we pay attention. Let me outline a few, and suggest a few problems that we need to study closer.

First, the attention we have is consumed by the information available. This is an old observation that Herbert Simon made in a 1969 talk on information wealth and attention poverty. His answer, then, remarkably, was that we need to invest in artificial intelligence to augment attention and allow for faster learning (we should examine the relationship between learning and democracy as well at some point: one way to think about learning is that it is what happens when we change our opinions) – but more importantly he noted that there is an acute need to allocate attention efficiently. We could build on that and note that at very high degrees of efficiency in the allocation of attention, democratic discourse becomes impossible.

Second, we have learnt something very important about information in the last twenty years or so, and that is that the non-linear value of information presents some large challenges for us as a society. Information – in abundance – collapses into noise, and its value can then quickly become negative; we need to sift through the noise to find meaning, and that creates filter costs that we have to internalize. There is, almost, a pollution effect here. The production of information by each and every one of us comes with a negative externality in the form of noise.

Third, the need for filters raises a lot of interesting questions about the design of such filters. The word ”filter” comes with a negative connotation, but here I only mean something that allows us to turn noise into information over which we can effectively allocate attention.

That attention plays a crucial role in the information society is nothing new, as we mentioned, and it has been helpfully emphasized by people like Tim Wu, Tristan Harris and others. There is often an edge in the commentary here that suggests that attention is being harvested and monetized, and that this is in some way detrimental. This is worth a separate debate, but let it suffice for now to acknowledge that this can certainly be the case, but also that the fact that attention can be monetized can be very helpful. In fact, good technology converts attention to money at a higher exchange rate and ensures that the individual reaps the benefits, for example by finding what he or she is looking for faster. But again: this is worth a separate discussion – and perhaps one where we need to dig deeper into the question of the social value of advertising as such, a much debated issue.

So, where does this land us? It seems that we need to combat distraction and allocate attention effectively. What, then, is distraction?

*

Fake news and disinformation are one form of distraction, and certainly a nefarious one in the sense that such distractions often detract from efforts to form opinions in a more serious way. But there are many other distractions as well. Television, games, gambling and everything else that exists in the leisure space is in a way a distraction. When Justice Brandeis said that leisure time is the time we need to use to become citizens, he attacked the problem of distraction from a much broader perspective than we sometimes do today. His notion was that when we leave work, we have to devote time to our other roles, and one of the key roles we play is that of the citizen. How many of us devote time every day or week to our citizen role? Is there something we can do there?

*

The tension between distraction and attention forces us to ask a more fundamental question, and that is whether the distraction we are consumed by is forced or voluntary. Put differently: assume that we are interested in forming an opinion on some matter – can we do so with reasonable effort, or are the distractions so detrimental that the formation of informed and reasoned opinion has become impossible?

At some level this is an empirical question. We can try: assume that you are making your mind up on climate change. Can you use the Internet, use search and social networks in order to form a reasoned opinion on whether climate change is anthropogenic? Or are the distractions and the disinformation out there so heavy that it is impossible to form that opinion?

Well, you will rightly note, that will differ from person to person. This is fair, but let’s play with averages: the average citizen who honestly seeks to make up his or her mind – can they do so on a controversial issue?

A quick search, a look at Wikipedia, a discussion with friends on a social network — could this result in a reasoned opinion? Quite possibly! It seems that anyone who argues that this is impossible today also needs to carry the burden of evidence for that statement. Indeed, it would be extraordinary if we argued that someone who wants to inform themselves no longer can, in the information society.

There are a few caveats to that statement, however. One is about the will itself. How much do we want to form reasoned opinions? This is a question that risks veering into elitism and a view from above (I can already hear the answers along the lines of ”I obviously do, but others…”), so we need to tread carefully. I do think that there are competing scenarios here. Opinions have many uses. We can use them to advance our public debate, but if we are honest a large use case for opinions is the creation of a group and the cohesion of that group. How many of our opinions do we arrive at ourselves, and how many do we accept as part of our belonging to a particular group?

Rare is the individual who says that she has arrived, alone, at all of her opinions. Indeed, that would make no sense, as it would violate Simon’s dictum: we need to allocate attention efficiently and we rely on others in a division of attention that is just a mental version of Adam Smith’s division of labor. We should! To arrive at all your own opinions would be so costly that you would have little time to do anything else, especially in a society that is increasingly complex and full of issues. The alternative would be to have very few opinions, and that seems curiously difficult. Not a lot of people offer that they have no opinion on a subject that is brought up in conversation, and indeed it would almost feel asocial to do that!

So group opinions are rational consequences of the allocation of attention, but how do we know whether the group arrives at its opinion in a collectively rational way? It depends on the group, and how it operates, obviously, but at the heart of the challenge here is a sense of trust in the judgments of others.

The opinions we hold that are not ours are opinions we hold because we trust the group that arrived at them. Trust matters much more than we may think in the formation of opinion.

*

If distraction is one challenge for democratic societies, misallocation of attention is another. The difference is clear: distraction is when we try to but cannot form an opinion. Misallocation is when we do not want to form a reasoned opinion but are more interested in the construction of an identity or a sense of belonging, and hence want to confirm an opinion that we have adopted for some reason.

The forming and confirming of opinion are very different things. In the first case we shape and form our opinion and it may change over time; in the second we simply confirm an opinion that we hold without examining it at all. It is well known that we are prone to confirmation bias and that we seek information that confirms what we believe to be true, and this tendency sometimes wins over our willingness to explore alternative views, especially in controversial and emotional issues. That is unfortunate, but the question is what its relationship is to disinformation.

One answer could be this: the cost of confirmation bias falls when there is a ready provision of counter facts to all facts. Weinberger notes that the old dictum that you are entitled to your opinions, but not your facts, has become unfashionable in the information society since there is no single truth anymore. For every fact there is a counter-fact.

Can we combat this state of affairs? How do we do that? Can we create a repository and a source of facts and truths? How do you construct such an institution?

Most of us naturally think of the Wikipedia when we think of something like that – but there is of course much in the Wikipedia that is faulty or incorrect, and this is not a dig against the Wikipedia, but simply a consequence of its fantastic inclusiveness and collaborative nature. Also – we know that facts have a half-life in science, and the idea of incontrovertible fact is in fact very unhelpful and has historically been used more by theologians than by democrats. And yet we still need some institutional response to the flattening of truth.

It is not obvious what that would be, but worth thinking about and certainly worth debating.

*

So individual will and institutional truth, ways of spending attention wisely and the sense of citizenship. That is a lot of rather vague hand-waving and sketching, but it is a start. We will return to this question in the course of the year, I am sure. For now, this just serves as a few initial thoughts.

What are we talking about when we talk about algorithmic transparency?

The term ”algorithmic transparency”, with variants and variations, has become more and more common in the many conversations I have with decision makers and policy wonks. It remains somewhat unclear what it actually means, however. As a student of philosophy I find that there is often a lot of value in examining concepts closely in order to understand them, and in the following I want to open up a coarse-grained view of this concept in order to understand it further.

At first glance it is not hard to understand what is meant by algorithmic transparency. Imagine that you have a simple piece of code that manipulates numbers, and that when you enter a series it produces another series as output. Say you enter 1, 2, 3, 4 and that the output generated is 1, 4, 9, 16. You have no access to the code, but you can infer that the code probably takes the input and squares it. You can test this with a hypothesis – you decide to see if entering 5 gives you 25 in response. If it does, you are fairly certain that the code is something like ”take input and print input times input” for the length of the series.
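
To make this concrete, here is a minimal sketch of that kind of black-box probing (the function names are hypothetical, and the hidden code is assumed to be a simple squaring function):

```python
def hidden_system(x):
    # The code we have no access to; in this toy case it squares its input.
    return x * x

def test_hypothesis(system, hypothesis, inputs):
    """Probe the black box: does our guessed rule reproduce its outputs?"""
    return all(system(x) == hypothesis(x) for x in inputs)

# Our inferred rule: "take input and print input times input".
print(test_hypothesis(hidden_system, lambda x: x * x, [1, 2, 3, 4, 5]))  # True
```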

Now, you don’t _know_ that this is the case. You merely believe so, and for every new number you enter that seems to confirm the hypothesis, your belief may be slightly corroborated (depending on which species of theory of science you subscribe to). If you want to know, really know, you need to have a peek at the code. So you want algorithmic transparency – you want to see and verify the code with your own eyes. Let’s clean this up a bit and we have a first definition.

(i) Algorithmic transparency means having access to the code a computer is running, so as to allow a human to verify what it is doing.

So far, so good. What is hard about this, then, you may ask? In principle we should be able to do this with any system – just check the code and verify that it does what it is supposed to do, right? Well, this is where the challenges start coming in.

*

The first challenge is one of complexity. Let’s assume that the system you are studying has a billion lines of code and that, to understand what the system does, you need to review all of them. Assume, further, that the lines of code refer to each other in different ways, and that there are interdependencies, different instantiations and so forth – you will then end up in a situation where access to the code is essentially meaningless, because access does not guarantee verifiability or transparency in any meaningful sense.

This is easily realized by simply calculating the time needed to review a billion-line piece of software (note that we are assuming here that software is composed of lines of code – not an obvious assumption, as we will see later). Say you need one minute to review a line of code – that makes for a billion minutes, and that is a lot. A billion seconds is 31.69 years, so even if you assume that you can verify a line a second, the time needed is extraordinary. And remember that we are assuming that _linear verification_ will be exhaustive – a very questionable assumption.
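
The back-of-the-envelope arithmetic, for anyone who wants to check it:

```python
SECONDS_PER_YEAR = 3600 * 24 * 365.25
LINES = 10**9

print(LINES / SECONDS_PER_YEAR)        # ~31.7 years at one line per second
print(LINES * 60 / SECONDS_PER_YEAR)   # ~1900 years at one minute per line
```
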
So we seem to have one interesting limitation here, one that we should think about.

L1: Complexity limits human verifiability.

This is hardly controversial, but it is important. So we need to amend our definition here, and perhaps think about computer-assisted verification. We end up with something like:

(ii) Algorithmic transparency is achieved by access to the code that allows another system to verify the way the system is designed.

There is an obvious problem with this that should not be glossed over. As soon as we start using code to verify code we enter an infinite regress. Using code to verify code means we need to trust the verifying code over the verified. There are ways in which we can become comfortable with that, but it is worth understanding that our verification is now conditional on the verifying code working as intended. This qualifies as another limit.

L2: Computer-assisted verification relies on blind trust at some point.

So we are back to blind trust, but the choice we have is what system we have blind trust in. We may trust a system that we have used before, or that we believe we know more about the origins of, but we still need to trust that system, right?
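
A toy illustration of the regress (names and test cases are hypothetical): the verifier below checks the target code against expected input/output pairs, but nothing checks the verifier itself except yet another verifier.

```python
def target(x):
    # The code under scrutiny.
    return x * x

def verifier(func, cases):
    # Check the target against a list of (input, expected output) pairs.
    return all(func(x) == expected for x, expected in cases)

print(verifier(target, [(1, 1), (2, 4), (5, 25)]))  # True – but who verifies the verifier?
```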

*

So, our notion of algorithmic transparency is turning out to be quite complicated. Now let’s add another complication. In our proto-example of the series, the input and output were quite simple. Now assume that the input consists of trillions of documents. Let’s remain in our starkly simplified model: how do you know that the system – now complex – is doing the right thing given the data?

This highlights another problem. What exactly is it that we are verifying? There needs to be a criterion here that allows us to state that we have achieved algorithmic transparency or not. In our naive example above this seems obvious, since what we are asking about is how the system is working – we are simply guessing at the manipulation of the series in order to arrive at a rule that will allow us to predict what a certain input will yield in terms of an output. Transparency reveals if our inferred rule is the right one and we can then debate if that is the way the rule should look. The value of such algorithmic transparency lies in figuring out if the system is cheating in any way.

Say that we have a game: I show you the series 1, 2, 3, 4 and the output 1, 4, 9, 16, and then ask you to bet on what the next output will be as I enter 5. You guess 25, I enter 5, and the output is 26. I win the bet. You demand to see the code, and the code says: ”For every input print input times input, except if input is 5, then print input times input _plus one_”.
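
In code, the trap in this story might look something like this (a minimal sketch; the function name is made up):

```python
def rigged(x):
    # The advertised rule: square the input...
    if x == 5:
        # ...except for the one input the author knows will be bet on.
        return x * x + 1
    return x * x

print([rigged(x) for x in [1, 2, 3, 4]])  # [1, 4, 9, 16] – looks like plain squaring
print(rigged(5))                          # 26 – the trap
```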

This would be cheating. I wrote the code. I knew it would do that. I put a trap in the code, and you want algorithmic transparency to be able to see that I have not rigged the code to my advantage. You are verifying two things: that the rule you have inferred is the right one AND that the rule is applied consistently. So it is about the working of the system as well as its consistency, or its lack of bias in any way.

Bias or consistency is easy to check when you are looking at a simple mathematical series, but how do you determine consistency in a system that contains a trillion data points and runs on, say, over a billion lines of code? What does consistency even mean there? Here is another limitation, then.

L3: Algorithmic transparency needs to define criteria for verification such that they are possible to determine with access to the code and data sets.

I suspect this limitation is not trivial.

*

Now, let’s complicate things further. Let’s assume that the code we use generates a network of weights that are applied to decisions in different ways, and that this network is trained by repeated exposure to data and its own simulations. The end result of this process is a weighted network with certain values across it, and perhaps they are even arrived at probabilistically. (This is a very simplified model, extremely so.)

Here, by design, I know that the network will look different every time I ”train” it. That is just a function of its probabilistic nature. If we now want to verify this, what we are really looking for is a way to determine a range of possible outcomes that seem reasonable. Determining that will be terribly difficult, naturally, but perhaps it is doable. But at this point we start suspecting that maybe we are engaging with the issue at the wrong level. Maybe we are asking a question that is not meaningful.
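
A minimal sketch of the point (a toy, hypothetical "network" with a single weight, standing in for any stochastic training process): the same training code, run with different random seeds, ends up with slightly different weights, so verification would have to target a range of reasonable outcomes rather than one exact artifact.

```python
import random

def train_toy_network(seed, steps=1000):
    """Fit a single weight to the rule y = 2*x with noisy update steps."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-1, 1)
        y = 2 * x
        # Gradient-like step plus a little noise, mimicking stochastic training.
        w += 0.1 * (y - w * x) * x + rng.gauss(0, 0.01)
    return w

print(train_toy_network(seed=1))  # close to 2.0, but not exactly
print(train_toy_network(seed=2))  # a slightly different value
```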

*

We need to think about what it is that we want to accomplish here. We want to be able to determine how something works in order to understand if it is rigged in some way. We want to be able to explain what a system does, and to ensure that what it does is fair, by some notion of fairness.

Our suspicion has been that what we need to do is verify the code behind the system, but that is turning out to be increasingly difficult. Why is that? Does that mean that we can never explain what these systems do?

Quite the contrary, but we have to choose an explanatory stance – to draw on a notion introduced by D. C. Dennett. Dennett, loosely, notes that systems can be described in different ways, from different stances. If my car does not start in the morning, I can describe this problem in a number of different ways.

I can explain it by saying that it dislikes me and is grumpy, adopting an _intentional_ stance and treating the system as intentional.
I can explain it by saying I forgot to fill up on gasoline yesterday, and so the tank is empty – this is a _functional_ or mechanical explanation.
I can explain it by saying that the wave functions associated with the car are not collapsing in such a way as to…or use some other _physical_ explanation of the car as a system of atoms or a quantum physical system.

All explanations are possible, but Dennett and others note that we would do well to think about how we choose between the different levels. One possibility is to look at how economical and how predictive an explanation is. While the intentional explanation is the shortest, it gives me no way to predict what will allow me to change the system. The mechanical or functional explanation does – and the physical one would take page upon page to do in any detail, and so is clearly uneconomical.

Let me suggest something perhaps controversial: the demand for algorithmic transparency is not unlike an attempt at explaining the car’s malfunctioning from a quantum physical stance.

But that just leaves us with the question of how we achieve what arguably is a valuable objective: to ensure that our systems are not cheating in any way.

*

The answer here is not easy, but one way is to focus on function and outcomes. If we can detect strange outcome patterns, we can assume that something is wrong. Let’s take an easy example. Say that an image search for physicist on a search engine leads to a results page that mostly contains white, middle-aged men. We know that there are certainly physicists who are neither male nor white, so the outcome is weird. We then need to understand where that weirdness is located. A quick analysis gives us the hypothesis that maybe there is a deep bias in the input data set, where we, as a civilization, have actually assumed that a physicist is a white, middle-aged man. By looking only at outcomes we are able to understand whether there is bias or not, and then form hypotheses about where that bias is introduced. The hypotheses can then be confirmed or disproven by looking at separate data sources, like searching a stock photo database or using another search engine. Nowhere do we need to, or would we indeed benefit from, looking at the code. Here is another potential limitation, then.

L4: Algorithmic transparency is far inferior to outcome analysis in all sufficiently complex cases.

Outcome analysis also has the advantage of being openly available to anyone. The outcomes are necessarily transparent and accessible, and we know this from a fair number of previous cases – just by looking at the outcomes we can form a view on whether a system is inherently biased or not, and on whether that bias is pernicious or not (remember that we want systems biased against certain categories of content, to take a simple example).
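
A minimal sketch of what such outcome analysis could look like (the data shape and numbers are hypothetical, purely for illustration): compare the distribution of some attribute in the returned results against a reference distribution and report the deviation.

```python
from collections import Counter

def outcome_skew(results, reference, attribute):
    """Difference in the share of each attribute value between results and reference."""
    res = Counter(r[attribute] for r in results)
    ref = Counter(r[attribute] for r in reference)
    res_total, ref_total = sum(res.values()), sum(ref.values())
    return {value: res[value] / res_total - ref[value] / ref_total
            for value in set(res) | set(ref)}

# Toy example: image search results for "physicist" vs. a reference sample.
results = [{"gender": "male"}] * 18 + [{"gender": "female"}] * 2
reference = [{"gender": "male"}] * 13 + [{"gender": "female"}] * 7
print(outcome_skew(results, reference, "gender"))  # male over-represented by 0.25, female under by 0.25
```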

*

So, summing up. As we continue to explore the notion of algorithmic transparency, we need to focus on what it is that we want to achieve. There is probably a set of interesting use cases for algorithmic transparency, and more than anything I imagine that the idea of algorithmic transparency actually is an interesting design tool to use when discussing how we want systems to be biased. Debating, in meta code of some kind, just how bias _should be_ introduced in, say, college admission algorithms, would allow us to understand what designs can accomplish that best. So maybe algorithmic transparency is better for the design than the detection of bias?

A note on complementarity and substitution

One of the things I hear most often in the many conversations I have on tech and society today is that computers will take jobs or that man will be replaced by machine. It is a reasonable and interesting question, but, I think, ultimately the wrong one. I have tried to collect a few thoughts about that in a small essay here for reference. The question interests me for several reasons – not least because I think that it is partly a design question rather than something driven by technological determinism. This in itself is a belief that could be challenged on a number of fronts, but I think there is a robust defense for it. The idea that technology has to develop in the direction of substitution is simply not true if we look at all existing systems. Granted: when we can automate not just a task but cognition generally, this will be challenged, but strong reasons remain to believe that we will not automate fully. So, more on this later. (Image: Robin Zebrowski)

Autonomy, technology and prediction I: some conceptual remarks

”How would you feel if a computer could predict what you would buy, how you would vote and what kinds of music, literature and food you would prefer with an accuracy that was greater than that of your partner?”

Versions of this question have been thrown at me in different fora over the last couple of months. It contains much to be unpacked, and it turns out to be a really interesting entry point into a philosophical analysis of autonomy. Here are a few initial thoughts.

  1. We don’t want to be predictable. There is something negative about that quality that is curious to me. While we sometimes praise predictability, we then call it reliability, not predictability. Reliability is a relational concept – we feel we can rely on someone, but predictability is something that has nothing to do with relationships, I think. If you are predictable, you are in some sense a thing, a machine, a simple system. Predictable people lose some of their humanity. Take an example from popular culture – the hosts in Westworld. They are caught in loops that make them easy to predict, and in a key scene Dr Ford expresses his dislike for humanity by saying that the same applies to humans: we are also caught in our loops.
  2. The flip side of that, of course, is that no one would want to be completely unpredictable. Someone who at any point might throw themselves out the window, start screaming, steal a car or disappear into the wilderness to write poetry would also be seen as less than human. Humanity is a concept associated with a mix of predictability and unpredictability. To be human is to occasionally surprise others, but also to be relied upon for some things.
  3. To be predictable is often associated with being easy to manipulate. The connection between the two is not entirely clear cut, since it does not automatically follow from someone being predictable that they can be manipulated.
  4. One way to think about this is to think about the role of predictability in game theory. There are two perspectives here: one is that in order to make credible threats, you need to be predictable in the sense that you will enforce those threats under the circumstances you have defined. There are even techniques for this – you can create punishments for yourself, like the man who reputedly gave his friend 10 000 USD to donate to the US national socialist party (a party the man hated) if his friend ever saw him smoking. Commitment to a cause is nothing other than predictability. Following Schelling, however, a certain unpredictable quality is also helpful in a game, when the rational thing to do is what favors the enemy. One apocryphal anecdote about Herman Kahn – who advocated thermonuclear war as a possibility – is that he was paid to do this so as to keep the Soviets guessing whether the US really could be crazy enough to entertain the idea of such a complete war. In games it is the shift between predictability and unpredictability – the bluff! – that matters.
  5. But let’s return to the question. How would we feel? Would it matter how much data the computer needed to make its predictions? Would we feel worse or better if it was easier to predict us? Assume it took only 200 likes from a social network to make these predictions – would that be horrifying or calming to you? The first reaction here may be that we would feel bad if it was in some sense easy to predict us. But let’s consider that: if it took only 200 likes to predict us, the predictions would be thin, and we could change easily. The prediction horizon would be short, and the prediction thin. Let’s pause and examine these concepts, as I think they are important.
  6. A prediction horizon is the length of time for which I can predict something. In predicting the weather, one question is for how long we can predict it – for a day? For a few days? For a year? Anyone able to do that – predict the weather accurately for a year – would have accomplished something quite amazing. But predicting the weather tomorrow? You can do that with 50% accuracy by saying that tomorrow will be like today. Inertia helps (see the toy sketch after this list). The same phenomenon applies to the likes. If you are asked to predict what someone will do tomorrow, looking at what they did today is going to give you a pretty good idea. But it is not going to be a very powerful prediction, and it is not one that in any real sense threatens our autonomy.
  7. A prediction is thin if it concentrates on a few aspects of the predicted system. An example is predicted taste in books or music. Predicting what you will like in a new book or a new piece of music is something that can be done fairly well, but the prediction is thin and does not extend beyond its domain. It tells you nothing about who you will marry or whether you will ever run for public office. A thick prediction cuts across domains and would enable the predictor to ask a broad set of questions about you and predict the majority of your actions over the prediction horizon.
  8. There is another concept that we need as well. We need to discuss prediction resolution. The resolution of a prediction is about the granularity of the prediction. There is a difference between predicting that you will like Depeche Mode and predicting that you will like their third album more than the fourth, or that your favorite song will be ”Love in itself”. As resolution goes down, prediction becomes easier and easier. The extreme case is the Keynesian quip: in the long run we are all dead.
  9. So, let’s go back to the question about the data set. It obviously would be different if a small data set allowed for a thick, granular prediction across a long horizon than if that same data set just allowed for a short-horizon, thin prediction with low resolution. When someone says that they can predict you, you need to think about which one it is – and then the next question becomes whether it is better if it takes a large data set to do the same.
  10. Here is a possibility: maybe we can be relaxed about thin predictions over short horizons with low resolution based on small data sets (let’s call these a-predictions), because these will not affect autonomy in any way. But thick predictions over long horizons with high resolution, based on very large data sets are more worrying (let’s call these b-predictions).
  11. Here are a few possible hypotheses about these two classes of predictions.
    1. The possibility of a-predictions does not imply the possibility of b-predictions.
    2. Autonomy is not threatened by a-predictions, but by b-predictions.
    3. The cost of b-predictions is greater than the cost of a-predictions.
    4. Aggregated a-predictions do not become b-predictions.
    5. a-predictions are necessary in a market economy for aggregated classes of customers.
    6. a-predictions are a social good.
    7. a-predictions shared with the predicted actor change the probability of the a-predictions.
  12. There are many more possible hypotheses worth examining and thinking about here, but this suffices for a first exploration.
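
As mentioned in point 6 above, a toy sketch of prediction horizons (the "weather" data is simulated and purely hypothetical, with strong day-to-day persistence): a naive persistence forecast – the future will be like today – does well at a one-day horizon and decays towards chance as the horizon grows.

```python
import random

def persistence_accuracy(series, horizon):
    """Accuracy of the naive 'the future will be like today' forecast at a given horizon."""
    hits = sum(1 for i in range(len(series) - horizon) if series[i] == series[i + horizon])
    return hits / (len(series) - horizon)

# Simulated weather: each day has an 80% chance of being like the day before.
rng = random.Random(0)
weather = ["sunny"]
for _ in range(4999):
    stay = rng.random() < 0.8
    weather.append(weather[-1] if stay else ("rainy" if weather[-1] == "sunny" else "sunny"))

print(persistence_accuracy(weather, horizon=1))   # well above chance: short horizons are easy
print(persistence_accuracy(weather, horizon=30))  # close to 0.5: long horizons approach coin-flipping
```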

(image: Mako)