
Will robo-journalists take over the industry?

Automation has finally infiltrated the journalism industry. The benefits are impressive, but not everyone is happy about it.

“I think the interesting thing is, increasingly everything is becoming automated. Everything is, in a sense, being generated by machines,” Chris Middleton, a UK AI expert and journalist, told AI Today.

Automation is becoming the new normal. It is everywhere, but when it first cropped up, many argued that creative industries would not be affected. Yet here we are, years down the line, with robots writing actual news stories.

Data journalism has been a significant trend, with the importance of facts and figures underlined by Joseph Pulitzer himself over a century ago. In a 1904 essay for the North American Review, Pulitzer outlined the skills he thought journalists would need in order to serve a civic role; among them was training in statistics.

Joseph J. Pulitzer. Image by: The Cyclopædia of American Biography (1918) via Wikimedia Commons.

What followed over the next few decades was the adoption of the latest technology to make data journalism easier, which birthed a new trend: computational journalism. Computers helped journalists do their jobs more easily, with greater accuracy and scale. In July 1967, Philip Meyer of the Detroit Free Press covered an early-morning riot in Detroit. Inspired by academics’ use of statistics, he devised surveys with the help of social scientists to investigate public opinion on the issue. His “swift and accurate investigation into the underlying causes” won the Pulitzer Prize for Local General Reporting in 1968.

"Swift and accurate" are common words to describe automation. But that is not all— besides generating stories faster, automation can also distribute news rapidly, with a wider reach too. Once pre-programmed, the technology can create new stories and angles, which may vastly increase the amount of news available.

An introduction to automation in journalism. Video by: Ainaa Mashrique. Soundtrack by: Bensound.com. DISCLAIMER: all pictures and clips used within this video belong to their respective owners: CNN, Caspian Report, CNBC, Expedia, The Tonight Show, The Globe and Mail and The Jakarta Post. This is a transformative work, which constitutes “fair dealing” with copyright material, allowed under section 30 of the Copyright, Designs and Patents Act 1988.

Currently, automated journalism only works well for "routine" topics that have clean, accurate and structured data.

In recent years, the use of algorithms to automatically generate news has been news in itself— especially since one of the world’s largest and most highly regarded news organisations, the Associated Press (AP), started automating the production of its quarterly corporate earnings reports. But they were not the first…

Since 2012, Forbes has been working with Narrative Science, one of the major companies providing the technology, to generate company earnings previews. Much like the AP, Forbes has found that automation allows it to generate more stories while freeing up resources. As a result of the additional coverage, it now has a bigger audience and better site traffic— all good things for advertising. It is no surprise that other leading media companies such as The New York Times, Los Angeles Times and ProPublica have jumped on the bandwagon. Reginald Chua, executive editor for editorial operations, data and innovation at Thomson Reuters, told the Tow Center for Digital Journalism: “You can’t compete if you don’t automate.”

Since automation deals with low-level, routine reporting, journalists are now able to pursue more “important”, in-depth stories. Who likes writing boring, repetitive financial reports anyway?

However, not all journalists are that optimistic. Some argue that the quality of journalism will take a dive, claiming that automated stories lack depth and are “too” technical. It is important to note that, for now, automation is only used for topics with a lot of available data, like sports and finance. And sports and finance stories do not actually require much creativity.

That is not all, though. AI expert and journalist Chris Middleton highlighted the problem of news automation relying heavily on press releases, telling AI Today: “That worries me because maybe more of journalism will lean towards what press releases say.”

He added: “It is becoming more marketing based. And some organisations are investing in being totally automated, somebody puts up a press release and it turns into a news story. And possibly no human being was involved in that. That’s a worrying process for people who believe in journalism such as myself— we believe that you should go out and find stories, you know, write them yourselves because that’s the whole point.”

The debate is bound to be more interesting, as bots start to cover more challenging subjects— like political and social issues— in the future.

Software

This is all interesting, but how does automation actually happen? Natural Language Generation (NLG) is a developing subfield of AI, and it is employed in many different industries— including journalism. The leading platforms are Automated Insights’ Wordsmith, Narrative Science’s Quill, and Arria.

Just like most computational processes, an NLG system can be broken down into stages such as text planning, sentence planning, and realisation, which together form and group sentences with the right words, structure, grammar and style.

To put it simply, after the software collects the available data, algorithms use statistical methods to identify interesting highlights in the data set: notable events, such as a sharp increase or decline in the numbers. The software then classifies the information by importance. Next, it produces and arranges sentences, as instructed by programmers, to generate a narrative. Finally, the story can be uploaded to the publisher’s content management system, which may publish it automatically.
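The pipeline described above can be sketched in a few lines of code. The following is a minimal, hypothetical illustration— find notable swings in structured data, rank them by importance, then fill fixed sentence templates. The thresholds, function names and sample figures are invented for illustration; commercial systems like Wordsmith, Quill and Arria are far more sophisticated.

```python
def find_highlights(rows):
    """Flag quarters where revenue moved sharply versus the prior one."""
    highlights = []
    for prev, curr in zip(rows, rows[1:]):
        change = (curr["revenue"] - prev["revenue"]) / prev["revenue"]
        if abs(change) >= 0.10:  # treat a 10% swing as "interesting"
            highlights.append({"quarter": curr["quarter"], "change": change})
    return highlights

def rank(highlights):
    """Order events by importance -- here, simply the size of the swing."""
    return sorted(highlights, key=lambda h: abs(h["change"]), reverse=True)

def realise(highlight, company):
    """Turn one ranked event into a sentence via a fixed template."""
    direction = "rose" if highlight["change"] > 0 else "fell"
    return (f"{company}'s revenue {direction} {abs(highlight['change']):.0%} "
            f"in {highlight['quarter']}.")

def generate_story(company, rows):
    """Run the full pipeline: detect, rank, realise, assemble."""
    sentences = [realise(h, company) for h in rank(find_highlights(rows))]
    return " ".join(sentences)

# Invented sample data for a fictional company.
quarters = [
    {"quarter": "Q1", "revenue": 100.0},
    {"quarter": "Q2", "revenue": 118.0},
    {"quarter": "Q3", "revenue": 95.0},
]
print(generate_story("Acme Corp", quarters))
```

In a real newsroom deployment, the template bank and ranking rules would be written with editors, which is exactly the rule-setting the next paragraph describes.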

The software would still need to rely on a set of rules that are specific to the area it is being used in, usually decided by a team of engineers, journalists and computer linguists.

NLG involves complex programming. But sometimes it can be simple enough for in-house staff at a news organisation to build themselves.

Man coding a program on his laptop. Image by: Almonroth via Wikimedia Commons.

The LA Times started using its own automation program, first coded by Ken Schwencke, in a project called The Homicide Report, which aims to report every homicide in Los Angeles County. The program produces short news snippets, while crime reporters expand on the stories, giving them human-interest angles.
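A snippet generator of this kind can be tiny: one structured record in, one templated sentence out. The sketch below is a hypothetical illustration of the idea, not the LA Times’ actual code, and the record fields and sample data are invented.

```python
def homicide_snippet(record):
    """Fill a fixed sentence template from one structured record."""
    return (f"{record['name']}, a {record['age']}-year-old "
            f"{record['race']} {record['gender']}, died on {record['date']} "
            f"in {record['neighborhood']}, according to coroner's records.")

# Invented example record.
record = {
    "name": "John Doe", "age": 34, "race": "white", "gender": "man",
    "date": "March 3", "neighborhood": "Westlake",
}
print(homicide_snippet(record))
```

The snippet gives reporters a factual skeleton to build on, which is precisely the division of labour the project relies on.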

Ethical Issues

Bias is one of the factors that automation advocates often bring up, saying that algorithms deal with numbers, facts and figures— free from biases and errors. Of course, the underlying assumption is that the data is accurate and the algorithms are programmed responsibly.

But is it always more unbiased than humans? After all, humans do the programming…

Recent research has revealed that AI programs show racial and gender biases. Joanna Bryson, a computer scientist at the University of Bath, and one of the authors of the report, told The Guardian: “A lot of people are saying this is showing that AI is prejudiced. No. This is showing we’re prejudiced and that AI is learning it.”

The research focuses on a machine learning tool known as “word embedding”, which is already changing the way computers interpret speech and text. But the latest paper illustrated a troubling problem: the algorithms tended to associate the words “female” and “woman” with arts and humanities occupations and with the home, while “male” and “man” were linked to mathematics and engineering professions. They were also more likely to associate African American names with unpleasant words.

And in a critical report by Laurie Penny in The Guardian, Lydia Nicholas, a senior researcher at the innovation think-tank Nesta, explains that all this data was documented in the past, and therefore reflects the values of the past.

Having to depend on outdated data, which may still reflect divisive biases, is quite an ironic limitation for such a futuristic innovation.

Nonetheless, Sandra Wachter, a researcher in data ethics and algorithms at the University of Oxford, told the Guardian: “We can, in principle, build systems that detect biased decision-making, and then act on it.” She, alongside other experts and researchers, believes that there needs to be an "AI watchdog", to counteract any unintentional AI discrimination.

The possible future

If researchers find a way to avoid AI prejudice, and automation gets adopted in most if not all newsrooms, what would happen then?

We have already established that automated journalism may substantially increase the amount of available news. This could trigger a chain reaction, making it more difficult for consumers to find the content most relevant to them. Depending on your point of view this is either a luxury or a problem, but search engines and personalised news aggregators, such as Google News, would then become significantly more important.

Search engine providers claim to analyse individual user data to provide news consumers with the most relevant content. Therefore, different news consumers might receive different results for the same keyword searches. This may lead to partial information blindness, also known as the “filter bubble” hypothesis, where individuals constantly consume the same kind of information.

Consequently, they would experience less “cognitive dissonance”: it would be less likely for people to find information that challenges their views or contradicts their interests— and such challenges are crucial for forming public opinion in a democratic society.

Despite the theory’s mass appeal, there is no empirical evidence that proves the existence of the filter bubble. However, as automated news content and news availability increase, the need for content personalisation may very well increase too, making the “filter bubble” a theory worth watching.

At the moment, there are not many media organisations utilising automation and AI. Few software providers offer actual journalistic products, and the products that are available still have their limitations; they still need structured data and human pre-programming to work.

Therefore, journalists can sit back, relax and enjoy a cup of coffee… for now, at least.

According to Chris Middleton, there are still many organisations who will make “an absolute commitment to quality human journalism”, but some of these organisations are “using AI to bring people towards related content, other content that they might like”.

There is still a long way to go for automation technology. Image by: KUKA via Wikimedia Commons.

For now, according to a Tow report last year, it seems that automation in journalism is here to stay. It advised journalists to learn and practise skills that robots cannot yet compete with, like giving an article “the human touch” through a variety of storytelling techniques.

Automated journalism is still in its early experimental phases. But considering its cost-effective benefits, this may drastically change, and it may not be as far ahead in the future as one might think.

