The Associated Press and other news organizations have been using robots for several years to produce routine articles on finance and sports. Long before that, automated weather reports were being used by news agencies.
All three of these types of reporting rely on structured databases with ample historical information that can be transformed into text. The resulting text closely resembles writing by humans, although it can be drier and more technical.
The good news is that automating some forms of journalism production reduces errors and expands coverage. The reading public and even journalists themselves can’t always tell the difference between articles produced by humans and those produced by robots.
What this will mean for the future of the profession and journalists themselves is a developing story. In 2016, researcher Andreas Graefe produced Guide to Automated Journalism for the Tow Center for Digital Journalism at Columbia University. He looked at the potential of automation and its limitations for news consumers, news organizations, journalists, and society as a whole.
Last year, Jason Whittaker of the University of Lincoln in the UK published a more comprehensive study–Tech Giants, Artificial Intelligence, and the Future of Journalism–that examined automated journalism in the context of the new information gate-keepers and the dangers of the way they filter news for us.
Just to be clear on our terms, when we talk about “robots” we are describing automated journalism produced by software. The software follows rules familiar to journalists: put the most important information in the first paragraph, and describe the who, what, when, where, and how much. If the program is more sophisticated, it might explain some of the why and how.
A robot produces an earnings report
Here is an example of a robot-written story produced last week by the Associated Press for Yahoo News:
CHICAGO (AP) _ Gogo Inc. (GOGO) on Friday reported a loss of $22.4 million in its fourth quarter. On a per-share basis, the Chicago-based company said it had a loss of 28 cents.
The results topped Wall Street expectations. The average estimate of six analysts surveyed by Zacks Investment Research was for a loss of 49 cents per share.
The in-flight internet provider posted revenue of $221.3 million in the period, also beating Street forecasts. Seven analysts surveyed by Zacks expected $207 million.
For the year, the company reported that its loss narrowed to $146 million, or $1.81 per share. Revenue was reported as $835.7 million. The company’s shares closed at $1.93. A year ago, they were trading at $4.78.
This story was generated by Automated Insights (http://automatedinsights.com/ap) using data from Zacks Investment Research. Access a Zacks stock report on GOGO at https://www.zacks.com/ap/GOGO
The last paragraph above is supposed to inform readers that the article was produced by a machine. But a casual reader might not necessarily conclude that from the technical language used. And they might not care as long as they receive the desired data in a timely fashion.
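To make the mechanics concrete, here is a minimal sketch of how a template-based generator could turn one row of structured data into the opening paragraphs of a story like the one above. It is an illustration only, not the method used by the Associated Press or Automated Insights; the Earnings class, its field names, and the sentence templates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Earnings:
    """One row of structured data. All field names are hypothetical."""
    city: str
    company: str
    ticker: str
    day: str
    quarter: str
    net_income_musd: float  # millions of dollars; negative means a loss
    eps: float              # per-share result in dollars; negative means a loss
    eps_estimate: float     # average analyst estimate, same units as eps
    analysts: int           # number of analysts behind the estimate


def per_share(value: float) -> str:
    """Format a per-share figure: cents below one dollar, dollars otherwise."""
    amount = abs(value)
    if amount < 1:
        return f"{round(amount * 100)} cents"
    return f"${amount:.2f}"


def profit_or_loss(value: float) -> str:
    return "loss" if value < 0 else "profit"


def lead_paragraphs(e: Earnings) -> str:
    """Fill fixed sentence templates, most important facts first."""
    kind = profit_or_loss(e.net_income_musd)
    first = (
        f"{e.city.upper()} _ {e.company} ({e.ticker}) on {e.day} reported a "
        f"{kind} of ${abs(e.net_income_musd):.1f} million in its {e.quarter} "
        f"quarter. On a per-share basis, the company said it had a "
        f"{profit_or_loss(e.eps)} of {per_share(e.eps)}."
    )
    # The only "judgment" is a numeric comparison against the estimate.
    if e.eps > e.eps_estimate:
        verdict = "topped"
    elif e.eps < e.eps_estimate:
        verdict = "fell short of"
    else:
        verdict = "met"
    second = (
        f"The results {verdict} Wall Street expectations. The average "
        f"estimate of {e.analysts} analysts was for a "
        f"{profit_or_loss(e.eps_estimate)} of {per_share(e.eps_estimate)} "
        f"per share."
    )
    return first + "\n" + second


if __name__ == "__main__":
    # Figures taken from the example story above.
    row = Earnings("Chicago", "Gogo Inc.", "GOGO", "Friday", "fourth",
                   -22.4, -0.28, -0.49, 6)
    print(lead_paragraphs(row))
```

The point of the sketch is that the story's structure lives entirely in the templates, so the output can only be as good as the row of data it is given.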
Fewer errors, more productivity
Graefe describes several experiments that showed news consumers and journalists could not reliably identify whether articles about sports and finance were produced by a human or by a machine (Graefe, pp. 32-33). Automation in these two subject areas is only possible because “clean, accurate, structured data” is available and has been maintained over many years.
The Associated Press has said that its automated earnings reports had an error rate of only 1% compared with 7% for those produced by humans, mostly by eliminating typos and transposed digits (Graefe, pp. 18-19). In addition, the AP said it was able to cover 12 times more companies this way (Whittaker, p. 111).
Still, the old computer principle of “garbage in, garbage out” holds true. If the database provides faulty information, the articles produced from that database will be faulty as well. Graefe describes an automated article on Netflix in 2015 that inaccurately reported that the company’s earnings had missed forecasts and that the share price had fallen by 71%. The database had not been updated to reflect a 7-for-1 stock split. The article was later corrected (p. 18).
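A short worked calculation shows how an unadjusted stock split can turn a gain into an apparent collapse. The prices below are hypothetical round numbers chosen for illustration, not the actual Netflix figures; only the split ratio of 7 comes from the example above.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def split_adjusted(price: float, ratio: float) -> float:
    """Restate a pre-split price in post-split terms."""
    return price / ratio


# Hypothetical prices, for illustration only.
price_before_split = 350.0  # older price still stored in the database
price_after_split = 100.0   # current price, quoted after the split
split_ratio = 7.0           # each old share became seven new shares

# Comparing the raw numbers suggests the stock lost most of its value.
naive = pct_change(price_before_split, price_after_split)

# Adjusting the old price for the split shows the stock actually doubled.
adjusted = pct_change(split_adjusted(price_before_split, split_ratio),
                      price_after_split)

print(f"unadjusted change: {naive:.0f}%")
print(f"split-adjusted change: {adjusted:.0f}%")
```

With these illustrative inputs, the unadjusted comparison reports a drop of about 71%, while the split-adjusted comparison reports a gain of 100%. The software did its arithmetic correctly in both cases; the error lies in the data it was given.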
Consumers tend to find the automated product dry and boring. But for users who simply want the information quickly and in an understandable form, automated journalism may satisfy their basic need.
For journalists, having machines do routine work frees them up to add value to news reports. They can use their human skills such as providing analysis and context. That should be good news for journalists and humans generally. Still, some worry that they might be replaced. Cost-cutting has been a pattern in newsrooms for the past two decades.
Can local news be automated?
Whittaker describes a Press Association pilot project for automating local news that began in the UK and Ireland in 2017. It was called RADAR, for Reporters and Data and Robots. Fourteen local publishing groups with 20 titles participated (pp. 113-116).
The articles drew heavily on public databases of trends in birth registrations, business layoffs, life chances for disadvantaged children, and local traffic data, among others.
Here is an example of one of the stories produced by a local reporter with the aid of databases:
Headline: Seven potentially life-saving operations were cancelled in Croydon in October
Latest health data have revealed that the body which runs Croydon’s hospitals was one of 40 trusts in England to cancel at least one important procedure.
Croydon Health Services NHS Trust cancelled seven urgent and potentially life-saving operations in October, the latest health data have revealed.
It was one of only 40 hospital trusts in England to cancel at least one important procedure in October, according to statistics from the NHS. More than two-thirds of the country’s trusts did not rearrange a single urgent operation over the same time period.
Such operations can include swift action needed to save patients’ lives, limbs, and organs.
In the last 12 months, the trust, which runs Croydon University Hospital, has stopped 69 key surgeries.
The one missing element from this story is any comment from the local health and hospital officials.
Whittaker concludes, “In the immediate future, it is precisely the inability of software to interview respondents that marks the primary difference between local reporting via robot journalism and that conducted by people. But in terms of reportage style, it is clear that the algorithm of news can be repeated very effectively by software” (p. 115).
The future for humans in journalism
Both Graefe and Whittaker worry that the creators of the algorithms for news can build in biases that they are unaware of. The result would be a disservice to society. Thus a key role in automated journalism will be the editors who train the algorithms and test them for bias.
Whittaker is concerned about the role of algorithmic gatekeepers and how they will affect the information available to democratic societies. Governments and businesses are pushing the technology platforms such as Google and Facebook to eliminate or flag blatantly false content. But that is not easy to do.
What’s true and what’s false
As Claire Wardle has pointed out in First Draft, there are at least seven different types of misinformation and disinformation produced for various motives. How can the programmers of the algorithms sort out the differences when human beings have trouble making those judgments?
Should people who believe the Earth is flat or that vaccines poison children or that the US faked its moon landings be allowed to publish on the web? Or should that content be suppressed?
Some information is innocuous but some is more harmful or dangerous. An internet troll could easily sabotage a business or a hospital, for example, by spreading false images or information about supposed danger in its products and services. Politicians of various stripes are using the coronavirus crisis to polarize the public conversation about proper safety strategies.
Filters, echo chambers, and gatekeepers
From a business perspective, the growing concentration of gate-keeping power in the hands of Microsoft, Amazon, Google, Facebook, Comcast, Spotify, Disney, Netflix, and Apple, among others, could dictate the winners and losers in the global economy. They already do.
Eight years ago the website Frugal Dad produced a series of provocative graphics that purported to show that six companies controlled 90% of the media consumed in the US. Since then there has been considerable consolidation.
More recently, Recode’s Peter Kafka updated his ongoing coverage of which media conglomerates own which properties.
This dominance spills over into the rest of the global economy. Here in Europe, the music, TV, and movies from the US are a significant and sometimes dominant presence. And the European Commission and regulators see these media channels as a force that needs to be brought under control.
Anti-monopoly laws in Europe and the US differ significantly. In general, US regulators tolerate a monopoly enterprise if its consumers benefit from lower prices and better service. So Amazon, which fits the definition of a monopoly in books and other retail products, escapes most regulation.
However, in Europe, the duopoly of Facebook and Google in digital advertising, and the tax avoidance strategies of Apple and others, have resulted in billions of dollars in fines.
Given the avalanche of news, information, and entertainment on the Internet, these tech giants provide a useful service. They filter the flow. It’s why we use them. They recommend content they think we will like, based on our behavior and demographic profile.
So the gatekeepers filter what we consume in order to have us consume more of their content. Consequently, we get trapped in a bubble of our own habits. The tool is artificial intelligence, with the accent on artificial. To paraphrase an old newspaper slogan, it’s all the news that fits our biases.
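The bubble effect described above can be sketched in a few lines. This is a deliberately simplified illustration, not how any real platform works: the recommend function, the topic labels, and the click history are all hypothetical. It ranks candidate articles by how often the reader has already clicked on the same topic.

```python
from collections import Counter


def recommend(history, candidates, k=2):
    """Return the k candidates whose topics the reader has clicked most.

    history and candidates are lists of (title, topic) pairs.
    """
    topic_counts = Counter(topic for _, topic in history)
    # Topics the reader has never clicked score zero and sink to the bottom.
    ranked = sorted(candidates,
                    key=lambda item: topic_counts[item[1]],
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    clicks = [("Cup final recap", "sports"),
              ("Transfer rumors", "sports"),
              ("Markets close higher", "finance")]
    pool = [("New vaccine trial", "science"),
            ("Derby preview", "sports"),
            ("Bank earnings", "finance"),
            ("Injury update", "sports")]
    for title, topic in recommend(clicks, pool):
        print(topic, "-", title)
```

In this toy example a reader who has mostly clicked on sports is shown only more sports, and the science story never surfaces, which is the feedback loop behind the echo chamber.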
More routine journalism is being done by robots; is that good?