Algorithms can determine whether you get a loan, predict what diseases you might get and even assess how long you might live. It’s kind of important we can trust them!
David Spiegelhalter is the Winton Professor for the Public Understanding of Risk in the Statistical Laboratory, Centre for Mathematical Sciences at the University of Cambridge. As part of the Cambridge Science Festival he was talking (21st of March 2019) on the subject of making algorithms trustworthy.
I’ve heard David speak on many occasions and he is always informative and entertaining. This was no exception.
Algorithms now regularly advise on book and film recommendations. They work out the routes on your satnav. They control how much you pay for a plane ticket and, annoyingly, they show you advertisements that seem to know far too much about you.
But more importantly they can affect life and death situations. The results of an algorithmic assessment of what disease you might have could be highly influential, affecting your treatment, your well-being and your future behaviour.
David is a fan of Onora O’Neill who suggests that organisations should not be aiming to increase trust but should aim to demonstrate trustworthiness. False claims about the accuracy of algorithms are as bad as defects in the algorithms themselves.
The pharmaceutical industry has long used a phased approach to assessing the effectiveness, safety and side-effects of drugs. This includes the use of randomised controlled trials, and long-term surveillance after a drug comes onto the market, to spot rare side-effects.
The same sorts of procedures should be applied to algorithms. However, currently only the first phase of testing on new data is common. Sometimes algorithms are tested against the decisions that human experts make. Rarely are randomised controlled trials conducted, or the algorithm in use subjected to long-term monitoring.
Algorithms should be transparent. They should be able to explain their decisions as well as to provide them. But transparency is not enough. O’Neill uses the term ‘intelligent openness’ to describe what is required. Explanations need to be accessible, intelligible, usable, and assessable.
Algorithms need to be both globally and locally explainable. Global explainability relates to the validity of the algorithm in general, while local explainability relates to how the algorithm arrived at a particular decision. One important way of being able to test an algorithm, even when it’s a black box, is to be able to play with inputting different parameters and seeing the result.
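To make the idea of 'playing with the inputs' concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `black_box_risk` is a made-up stand-in for an opaque scoring model (it is not any real lender's algorithm), and the feature names and numbers are invented. The point is only that you can learn something about a particular decision by changing one input at a time and watching the output.

```python
from typing import Callable, Dict, Iterable


def black_box_risk(applicant: Dict[str, float]) -> float:
    """Hypothetical opaque model: returns a risk score between 0 and 1."""
    score = 0.3
    score += 0.004 * applicant["debt_thousands"]
    score -= 0.002 * applicant["income_thousands"]
    if applicant["missed_payments"] > 2:
        score += 0.1
    return max(0.0, min(1.0, score))


def what_if(model: Callable[[Dict[str, float]], float],
            baseline: Dict[str, float],
            feature: str,
            values: Iterable[float]) -> Dict[float, float]:
    """Vary one input while holding the others fixed, and record the output."""
    results = {}
    for value in values:
        probe = dict(baseline)   # copy, so the baseline case is left untouched
        probe[feature] = value
        results[value] = model(probe)
    return results


# An invented applicant, used as the case we want explained.
baseline = {"debt_thousands": 20.0, "income_thousands": 40.0, "missed_payments": 1.0}

sweep = what_if(black_box_risk, baseline, "missed_payments", [0, 1, 2, 3, 4])
for value, risk in sweep.items():
    print(f"missed_payments={value}: risk={risk:.2f}")
```

A sweep like this only gives a local explanation: it shows how this one applicant's score responds to one input. It says nothing about whether the model is valid in general.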
DeepMind (owned by Google) is looking at how explanations can be generated from intermediate stages of the operation of machine learning algorithms.
Explanation can be provided at many levels. At the top level this might be a simple verbal summary. At the next level it might be having access to a range of graphical and numerical representations with the ability to run 'what if' queries. At a deeper level, text and tables might show the procedures that the algorithm used. Deeper still would be the mathematics underlying the algorithm. Lastly, the code that runs the algorithm should be inspectable. I would say that a good explanation is dependent on understanding what the user wants to know - in other words, it is not just a function of the decision-making process but also a function of the user’s actual and desired state of knowledge.
Without these types of explanation, algorithms such as COMPAS, which is used in the US to predict recidivism, are difficult to trust.
It is easy to feel that an algorithm is unfair or can’t be trusted. If it cannot provide sufficiently good explanations, and claims about it are not scientifically substantiated, then it is right to be sceptical about its decisions.
Most of David’s points apply more broadly than to artificial intelligence and robots. They are general principles applying to the transparency, accountability and user acceptance of any system. Trust and trustworthiness are everything.
See more of David’s work on his personal webpage at http://www.statslab.cam.ac.uk/Dept/People/Spiegelhalter/davids.html and read his new book “The Art of Statistics: Learning from Data”, available shortly.
Cambridge (UK) is awash with talks at the moment, and many of these are about artificial intelligence. On Tuesday (12th of March 2019) I went to a talk, as part of Cambridge University’s science festival, by José Hernández-Orallo (Universitat Politècnica de València), titled 'Natural or Artificial Intelligence? Measures, Maps and Taxonomies'.
José opened by pointing out that artificial intelligence was not a subset of human intelligence. Rather, it overlaps with it. After all, some artificial intelligence already far exceeds human intelligence in narrow domains such as playing games (Go, Chess etc.) and some identification tasks (e.g. face recognition). But, of course, human intelligence far outstrips artificial intelligence in its breadth and the amount of training needed to learn concepts.
José’s main message was how, when it comes to understanding artificial intelligence, we (like the political scene in Britain at the moment) are in uncharted territory. We have no measures by which we can compare artificial and human intelligence or determine the pace of progress in artificial intelligence. We have no maps that enable us to navigate around the space of artificial intelligence offerings (for example, which offerings might be ethical and which might be potentially harmful). And lastly, we have no taxonomies to classify approaches or examples of artificial intelligence.
Whilst there are many competitions and benchmarks for particular artificial intelligence tasks (such as answering quiz questions or more generally reinforcement learning), there is no overall, widely used classification scheme.
My own take on this is to suggest a number of approaches that might be considered. Coming from a psychology and psychometric testing background, I am aware of the huge number of psychological testing instruments for both intelligence and many other psychological traits. See for example, Wikipedia or the British Psychological Society list of test publishers. What is interesting is that, I would guess, most software applications that claim to use artificial intelligence would fail miserably on human intelligence tests, especially tests of emotional and social intelligence. At the same time they might score at superhuman levels with respect to some very narrow capabilities. This illustrates just how far away we are from the idea of the singularity - the point at which artificial intelligence might overtake human intelligence.
Another take on this would be to look at skills. Interestingly, systems like Amazon's Alexa describe the applications or modules that developers offer as 'skills'. So for example, a skill might be to book a hotel or to select a particular genre of music. This approach defines intelligence as the ability to effectively perform some task. However, by any standard, the skill offered by a typical Alexa 'skill', Google Home or Siri interaction is laughably unintelligent. The artificial intelligence is all in the speech recognition, and to some extent the speech production side. Very little of it is concerned with the domain knowledge. Even so, a skills-based approach to measurement, mapping and taxonomy might be a useful way forward.
When it comes to ethics, there are also some pointers to useful measures, maps and taxonomies. For example, the blog post describing Josephine Young’s work identifies a number of themes in AI and data ethics. Also, the video featuring Dr Michael Wilby on the http://www.robotethics.co.uk/robot-ethics-video-links/ page starts with a taxonomy of ethics and then maps artificial intelligence into this framework.
But, overall, I would agree with José that there is not a great deal of work in this important area and that it is ripe for further research. If you are aware of any relevant research then please get in touch.
John Wyatt is a doctor, author and research scientist. His concern is the ethical challenges that arise with technologies like artificial intelligence and robotics. On Tuesday this week (11th March 2019) he gave a talk called ‘What does it mean to be human?’ at the Wesley Methodist Church in Cambridge.
To a packed audience, he pointed out how interactions with artificial intelligence and robots will never be the same as the type of ‘I – you’ relationships that occur between people. He emphasised the important distinction between ‘beings that are born’ and ‘beings that are made’ and how this distinction will become increasingly blurred as our interactions with artificial intelligence become commonplace. We must be ever vigilant against the use of technology to dehumanise and manipulate.
I can see where this is going. The tendency for people to anthropomorphise is remarkably strong - ‘the computer won’t let me do that’, ‘the car has decided not to start this morning’. Research shows that we can even attribute intentions to animated geometrical shapes ‘chasing’ each other around a computer screen, let alone cartoons. Just how difficult is it going to be to not attribute the ‘human condition’ to a chatbot with an indistinguishably human voice or a realistically human robot? Children are already being taught to say ‘please’ and ‘thank you’ to devices like Alexa, Siri and Google Home – maybe a good thing in some ways, but …
One message I took away from this talk was a suggestion for a number of new human rights in this technological age. These are: (1) the right to cognitive liberty (to think whatever you want), (2) the right to mental privacy (without others knowing), (3) the right to mental integrity and (4) the right to psychological continuity - the last two concerning the preservation of ‘self’ and ‘identity’.
A second message was to consider which country was most likely to make advances in the ethics of artificial intelligence and robotics. His conclusion – the UK. That reassures me that I’m in the right place.
See more of John’s work, such as his essay ‘God, neuroscience and human identity’ at his website johnwyatt.com
Writing about ethics in artificial intelligence and robotics can sometimes seem like it’s all doom and gloom. My last post for example covered two talks in Cambridge – one mentioning satellite monitoring and swarms of drones and the other going more deeply into surveillance capitalism where big companies (you know who) collect data about you and sell it on the behavioural futures market.
So it was really refreshing to go to a talk by Dr Danielle Belgrave at Microsoft Research in Cambridge last week that reflected a much more positive side to artificial intelligence ethics. Danielle has spent the last 11 years researching the application of probabilistic modelling to the medical condition of asthma. Using statistical techniques and machine learning approaches she has been able to differentiate between five more or less distinct conditions that are all labelled asthma. Just as with cancer there may be a whole host of underlying conditions that are all given the same name but may in fact have different underlying causes and environmental triggers.
This is important because treating a set of conditions that may have a family resemblance (as Wittgenstein would have put it) with the same intervention(s) might work in some cases, not work in others and actually do harm to some people. Where this is leading is towards personalised medicine, where each individual and their circumstances are treated as unique. This, in turn, potentially leads to the design of a uniquely configured set of interventions optimised for that individual.
The statistical techniques that Danielle uses attempt to identify the underlying endotypes (sub-types of a condition) from a set of phenotypes (the observable characteristics of an individual). Some conditions may manifest in very similar sets of symptoms while in fact they arise from quite different functional mechanisms.
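To give a flavour of what inferring hidden sub-types from observations can look like, here is a minimal sketch in Python. It is not Danielle's method. It is a textbook two-group Gaussian mixture fitted with the EM algorithm, chosen here as an assumed stand-in. The 'symptom score' and all the numbers are simulated and hypothetical. Every simulated patient carries the same label, but the scores are secretly drawn from two groups, and the model tries to recover those groups without being told who belongs where.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def fit_two_group_mixture(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with the EM algorithm."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # start the two groups at the extremes
    sds = [(hi - lo) / 4 or 1.0] * 2      # crude initial spread
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: how strongly does each hidden group claim each patient?
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate each group from its weighted patients.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)
            weights[k] = nk / len(values)
    return means, sds, weights


# Simulated phenotype: one hypothetical symptom score per patient, drawn from
# two hidden groups that share the same diagnostic label.
rng = random.Random(0)
scores = ([rng.gauss(2.0, 0.5) for _ in range(200)]
          + [rng.gauss(6.0, 0.5) for _ in range(200)])

means, sds, weights = fit_two_group_mixture(scores)
print("estimated group means:", [round(m, 2) for m in means])
print("estimated group shares:", [round(w, 2) for w in weights])
```

Real endotype discovery is far harder than this toy: there are many phenotypes rather than one score, the number of groups is unknown, and the groups overlap and change over time.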
Appearances can be deceptive and while two things can easily look the same, underneath they may in fact be quite different beasts. Labelling the appearance rather than the underlying mechanism can be misleading because it inclines us to assume that beast 1 and beast 2 are related when, in fact, the only thing they have in common is how they appear.
It seems likely that for many years we have been administering some drugs thinking we are treating beast 1 when in fact some patients have beast 2, and that sometimes this does more harm than good. This view is supported by the common practice, in asthma, cancer, mental illness and many other conditions, of getting the medication right by trying a few things until you find something that works.
But in the same way that, for example, it may be difficult to identify a person’s underlying intentions from the many things that they say (oops, perhaps I am deviating into politics here!), inferring underlying medical conditions from symptoms is not easy. In both cases you are trying to infer something that may be unobservable, complex and changing, from the things you can readily perceive.
We have come so far in just a few years. It was not long ago that some medical interventions were based on myth, guesswork and the unquestioned habits of deeply ingrained practices. We are currently in a time when, through the use of randomised controlled trials, interventions approved for use are at least effective ‘on average’, so to speak. That is, if you apply them to large populations there is significant net benefit, and any obvious harms are known about and mitigated by identifying them as side-effects. We are about to enter an era where it becomes commonplace to personalise medicine to targeted sub-groups and individuals.
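A little arithmetic shows how an intervention can be effective ‘on average’ while still harming some people. The sketch below is in Python, and the sub-group names, patient counts and effect sizes are all invented for illustration. They are not taken from any trial.

```python
# Hypothetical trial summary: mean change in a symptom score per sub-group
# (positive = improvement, negative = harm). All numbers are made up.
subgroups = {
    "endotype A": {"patients": 700, "mean_effect": 3.0},
    "endotype B": {"patients": 200, "mean_effect": 0.0},
    "endotype C": {"patients": 100, "mean_effect": -2.0},
}

total_patients = sum(g["patients"] for g in subgroups.values())

# The population-level figure is the patient-weighted mean of the sub-group effects.
average_effect = sum(
    g["patients"] * g["mean_effect"] for g in subgroups.values()
) / total_patients

# Sub-groups for whom the same intervention does harm on average.
harmed = [name for name, g in subgroups.items() if g["mean_effect"] < 0]

print(f"average effect across {total_patients} patients: {average_effect:+.2f}")
print("sub-groups harmed on average:", harmed)
```

In this made-up example the population-level figure is comfortably positive, yet one sub-group in ten is made worse. That gap is what personalising medicine to sub-groups is meant to close.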
It’s not yet routine and easy, but with dedication, skill and persistence together with advances in statistical techniques and machine learning, all this is becoming possible. We must thank people like Dr Danielle Belgrave who have devoted their careers to making this progress. I think most people would agree that teasing out the distinction between appearance and underlying mechanisms is both a generic and an uncontroversially ethical application of artificial intelligence.