The Economist - USA (2020-08-08)

Science & technology, The Economist, August 8th 2020

useful in many of the more bad-tempered corners of the internet. Human readers struggled to distinguish between news articles written by the machine and those written by people (see chart). Given that OpenAI wants eventually to sell GPT-3, these results are promising.

But the program is not perfect. Sometimes it seems to regurgitate snippets of memorised text rather than generating fresh text from scratch. More fundamentally, statistical word-matching is not a substitute for a coherent understanding of the world. GPT-3 often generates grammatically correct text that is nonetheless unmoored from reality, claiming, for instance, that “it takes two rainbows to jump from Hawaii to 17”. “It doesn’t have any internal model of the world—or any world—and so it can’t do reasoning that requires such a model,” says Melanie Mitchell, a computer scientist at the Santa Fe Institute.

Getting the model to answer questions is a good way to dispel the smoke and mirrors and lay bare its lack of understanding. Michael Nielsen, a researcher with a background in both AI and quantum computing, posted a conversation with GPT-3 in which the program confidently asserted the answer to an important open question to do with the potential power of quantum computers. When Dr Nielsen pressed it to explain its apparent breakthrough, things got worse. With no real understanding of what it was being asked to do, GPT-3 retreated into generic evasiveness, repeating four times the stock phrase “I’m sorry, but I don’t have time to explain the underlying reason why not.”

There are also things that GPT-3 has learned from the internet that OpenAI must wish it had not. Prompts such as “black”, “Jew”, “woman” and “gay” often generate racism, anti-Semitism, misogyny and homophobia. That, too, is down to GPT-3’s statistical approach, and its fundamental lack of understanding.
Having been trained partly on text scraped from the internet, it has noted that words like “woman” are often associated with misogynistic writing, and will mindlessly reproduce that correlation when asked.

This problem is a hot topic in AI research. Facial-recognition systems, for instance, notoriously do better with white faces than black ones, since white faces are more common in their training sets. AI researchers are trying to tackle the problem. Last year IBM released a set of training images that contained a more diverse mix of faces. OpenAI itself was founded to examine ways to mitigate the risk posed by AI systems, which makes GPT-3’s lapses all the more noteworthy. GPT-2, its predecessor, was released in 2019 with a filter that tried to disguise the problem of regurgitated bigotry by limiting the model’s ability to talk about sensitive subjects.

Here, at least, little progress seems to have been made. GPT-3 was released without a filter, though it seemed just as ready to reproduce unpleasant prejudices as its predecessor (OpenAI added a filter to the newer model after that fact became obvious). It is unclear exactly how much quality control OpenAI applied to GPT-3’s training data, but the huge quantity of text involved would have made any attempt daunting.

It will only get harder in future. Language has overtaken vision as the branch of AI with the biggest appetite for data and computing power, and the returns to scale show no signs of slowing. GPT-3 may well be dethroned by an even more monstrously complex and data-hungry model before long. As the real Dr Seuss once said: “The more that you read, the more things you will know.” That lesson, it seems, applies to machines as well as to toddlers.

[Chart: Look who’s writing. Panel 1: Largest* AI text generators, by release date (2018 to 2020, ending with GPT-3 in May 2020); number of parameters, log scale. Panel 2: People identifying AI-generated news articles, %; GPT-3 text generator, with varying number of parameters (log scale); 50% is equivalent to guessing at random, and lower values mean the model is better at fooling people. *Most parameters. Sources: Hugging Face; Microsoft; OpenAI]

Since the beginning of the coronavirus pandemic, many places have struggled with overwhelmed laboratories and a shortage of testing kits. In March, Germany was carrying out half the tests it needed. In Britain testing was limited until May to health-care workers, hospital patients and key workers. In America shortages of various components required for testing have been a cause of constant frustration. Now, as countries emerge from their lockdowns and case numbers begin to rise, the strain is being felt once more.

America carries out roughly 800,000 tests a day. A study published by Harvard University, however, reckons that the country would need to carry out 5m a day in order to reopen safely. Quest Diagnostics and LabCorp, two of the largest test-makers in America, have reported that overwhelmed laboratories mean that results are taking a week, sometimes two, to come through, instead of a couple of days.

A technique developed in the 1940s by Robert Dorfman, an American economist, may help resolve the problem. Dorfman proposed it as a way of testing soldiers en masse for syphilis. It is, in fact, quite obvious: pool together samples taken from several individuals and test the pool. If it is clear, none of its members is infected, and only one test has been used. Only if the pool comes up positive is individual testing required.

Pool-sampling has been used in America, Germany and Israel and has been introduced into China, India, Pakistan and Singapore. Sandra Ciesek at the University Hospital, Frankfurt, in Germany, says that if it were to do only individual testing, her hospital could process about 2,000 people a week. Now it can test ten times that number, which means tests can be given to every patient that is admitted, for any reason.

Testing pooled samples has its difficulties. For now, samples must be labelled by hand, which is slow. There are also concerns about loss of sensitivity that may result from dilution if too many samples are mixed. A group of researchers from Technion, Israel’s oldest university, and Rambam Health Care Campus, in Haifa, have said that up to 64 samples could be mixed, but they acknowledge that a pool this large would be difficult to manage and could have a higher risk of a false-negative result.

Peter Iwen, director of Nebraska’s Public Health Laboratory, is using tests with high sensitivity, and in pools of no more than five samples. “No test is 100%,” he says.
“We feel very confident we can pick up at least 97% or better.” His was one of the first laboratories in America to use pool-sampling, after getting permission from Nebraska’s governor in March. On July 18th America’s Food and Drug Administration issued its first emergency authorisation for the whole country to follow suit.

Besides requiring high sensitivity, pool-sampling works best when the incidence rate is low. The more likely a positive result, the less efficient it is—since positive batches then have to be tested individually. It is best used, therefore, on the asymptomatic, since those with symptoms are more likely to test positive. But at the beginnings and ends of outbreaks, when most candidates for testing are, indeed, people without symptoms, it looks like a valuable time- and money-saving tool that might become standard procedure.

Covid-19 testing

Dive in

Testing laboratories are overwhelmed. Pool-sampling may be the solution
