The New York Times is one of the world's leading newspapers. It is part of what is called the "agenda-setting media" and has long shaped American public and policy discourse. However, when it comes to the Middle East, South Asia, or much of the Global South, the NYT's coverage seems either lacking in depth or too agenda-driven. Beyond arguing from anecdotal evidence or conjecture, we have not been able to say much about that. Even with the multitude of language processing and machine learning tools available today, we do not yet have tools that can accurately detect media bias, intentional or unintentional, or scientifically prove agenda-driven reporting.
However, there are a few tools that can give us some quantitative data about any given text. While the methods and tools used here may not be 100% accurate, they do give us some idea about the texts analyzed.
For the current analysis, we used only NYT articles related to Pakistan. The NYT publishes dozens of articles daily, so to narrow the set down to relevant articles we chose those that contained the word "Pakistan" in the headline and were published between 1 January 2018 and 31 December 2018.
In 2018, the NYT published 118 articles that contained the word "Pakistan" in the headline, excluding audio- or video-based content that we could not analyze.
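The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the record layout (`headline`, `published`), the function name `select_articles`, and the whole-word matching rule are assumptions, and the sample records are made up rather than taken from the dataset.

```python
import re
from datetime import date

# Assumption: match "Pakistan" as a whole word, so "Pakistan's" counts
# but "Pakistani" does not. The article does not state the exact rule.
KEYWORD = re.compile(r"\bPakistan\b")

def select_articles(articles, start=date(2018, 1, 1), end=date(2018, 12, 31)):
    """Keep articles with the keyword in the headline, published in [start, end]."""
    return [
        a for a in articles
        if KEYWORD.search(a["headline"]) and start <= a["published"] <= end
    ]

# Made-up records, for illustration only.
sample = [
    {"headline": "Pakistan Holds Election", "published": date(2018, 7, 25)},
    {"headline": "Markets Rally in Asia", "published": date(2018, 7, 25)},
    {"headline": "Floods Hit Pakistan", "published": date(2017, 8, 1)},
]
selected = select_articles(sample)
print([a["headline"] for a in selected])
```

Audio and video items would still need to be dropped separately, since they carry no text to score.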
Sentiment analysis is the process of quantitatively identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral. A negative (-ve) score denotes a negative attitude, and a positive (+ve) score denotes a positive attitude. Sentiment analysis scores vary from -1 to +1, -1 being the most negative and +1 being the most positive.
For sentiment analysis we used VADER (Valence Aware Dictionary and sEntiment Reasoner) [link]. VADER has been tested on social media texts as well as New York Times editorials, which makes it the most relevant tool for our analysis. We ran a sentiment analysis on each sentence contained within the articles.
Since each sentence had a different sentiment score, we took the mean sentiment score of all sentences to calculate the overall sentiment of an article.
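The per-sentence scoring and averaging can be sketched as follows. This is a minimal sketch, not the exact pipeline used here. It assumes a naive regex sentence splitter (the article does not say which tokenizer was used) and assumes that the -1 to +1 score is VADER's `compound` value. The helper names `split_sentences`, `article_score`, and `vader_compound_scorer` are hypothetical.

```python
import re
from statistics import mean

def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def article_score(text, score_sentence):
    """Mean of the per-sentence scores; 0.0 for an article with no sentences."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return mean(score_sentence(s) for s in sentences)

def vader_compound_scorer():
    """Return a function mapping a sentence to VADER's compound score in [-1, 1].
    Requires the third-party `vaderSentiment` package."""
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    analyzer = SentimentIntensityAnalyzer()
    return lambda sentence: analyzer.polarity_scores(sentence)["compound"]
```

With the package installed, an article would be scored with `article_score(article_text, vader_compound_scorer())`. Because the scorer is passed in as a function, the same averaging step works with any other sentence-level scorer.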
The results of the sentiment analysis of all 118 articles under discussion are shown in the chart below.
We can see that most of the 118 articles have a negative sentiment and that the chart is heavily skewed towards the left. Hence, we can say with some confidence that the overall sentiment of the NYT's Pakistan coverage has been negative.
One can argue that if most of the events in Pakistan in 2018 were negative, the NYT cannot put a positive spin on them just to produce a positive sentiment score. This is a valid argument if we consider only the sentiment score, so we also need to check how objective the articles are.
To check the objectivity of the articles, we used another text processing tool called TextBlob. TextBlob provides a subjectivity score for each sentence. Just as with sentiment analysis, we ran subjectivity analysis on each sentence of every article. The total subjectivity score for an article was calculated by taking the mean of the subjectivity scores of all sentences in that article. Subjectivity scores range from 0 to 1, 0 being the most objective and 1 being completely subjective.
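A rough sketch of the subjectivity step is shown below. It assumes TextBlob's own sentence splitting (`TextBlob(text).sentences`), which may differ from the splitting actually used for this analysis. The helper names `textblob_subjectivities` and `article_subjectivity` are hypothetical.

```python
from statistics import mean

def textblob_subjectivities(text):
    """Per-sentence subjectivity scores in [0, 1] from TextBlob.
    Requires the third-party `textblob` package and its NLTK corpora."""
    from textblob import TextBlob
    return [s.sentiment.subjectivity for s in TextBlob(text).sentences]

def article_subjectivity(sentence_scores):
    """Mean subjectivity of an article; None when there are no sentences."""
    scores = list(sentence_scores)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("subjectivity scores must lie in [0, 1]")
    return mean(scores) if scores else None
```

With the package installed, `article_subjectivity(textblob_subjectivities(article_text))` would give the article-level score on the 0 to 1 scale.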
In the chart below, we have plotted each article on the x-axis according to its sentiment score and on the y-axis according to its subjectivity score.
If we consider subjectivity scores above 0.25 to be somewhat subjective, we can see in the chart above that most of the articles analyzed are rated subjective.
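The reading of the chart can also be expressed as a simple tally, sketched below. Only the 0.25 cutoff comes from the text. The label wording, the treatment of a score of exactly 0 as neutral, and the function names are assumptions, and the example points are made up.

```python
from collections import Counter

SUBJECTIVE_THRESHOLD = 0.25  # cutoff used in this article

def label_article(sentiment, subjectivity):
    """Return (tone, stance) labels for one article's two scores."""
    if sentiment < 0:
        tone = "negative"
    elif sentiment > 0:
        tone = "positive"
    else:
        tone = "neutral"
    # Labels are illustrative; the article only names the "somewhat subjective" side.
    stance = "somewhat subjective" if subjectivity > SUBJECTIVE_THRESHOLD else "more objective"
    return tone, stance

def tally(points):
    """Count how many (sentiment, subjectivity) points fall under each label pair."""
    return Counter(label_article(s, subj) for s, subj in points)

# Made-up points, for illustration only.
print(tally([(-0.3, 0.4), (-0.1, 0.2), (0.2, 0.3)]))
```

Counting articles per label pair gives a numeric check on what the scatter plot shows visually.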
The same chart is given below with an added dimension: the impact of each story in terms of traction on social media, measured by the number of tweets and retweets.
We can see that the most impactful stories by the NYT have been somewhat subjective and that the majority of them lie on the negative side of the sentiment scale.
In terms of topics covered, the NYT's coverage of Pakistan in 2018 was not very diverse and focused on the usual stories about politics and terrorism, the classic American lens through which Pakistan is viewed.
In the chart above, we can see that the most frequently mentioned keywords across all the stories under discussion were "Khan" and "Sharif", receiving 334 and 244 mentions respectively; the full name "Nawaz Sharif" was mentioned 58 times and "Imran Khan" 57 times. China was mentioned 150 times and the Taliban 115 times. Similarly, the rest of the keywords tell a story about the topics covered by the NYT in the past year, which include Aasiya Bibi, Belt and Road, CPEC, and Afghanistan, among others.
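The article does not describe how the keyword counts were produced. One plausible approach, shown here purely as an assumption, is to count single-word tokens and exact multi-word phrases separately. That would also explain why "Khan" has far more mentions than "Imran Khan": every occurrence of the full name also counts as an occurrence of the surname. The function name `count_terms` and the sample sentence are hypothetical.

```python
import re
from collections import Counter

def count_terms(texts, phrases=()):
    """Count single-word tokens across all texts, plus exact multi-word phrases.
    Matching is case-sensitive, so "Khan" and "khan" are counted separately."""
    words = Counter()
    phrase_counts = Counter()
    for text in texts:
        words.update(re.findall(r"[A-Za-z']+", text))
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            phrase_counts[phrase] += len(re.findall(pattern, text))
    return words, phrase_counts

# Made-up text, for illustration only.
texts = ["Imran Khan spoke. Mr. Khan later met Nawaz Sharif. Khan left."]
words, phrase_counts = count_terms(texts, phrases=("Imran Khan", "Nawaz Sharif"))
print(words["Khan"], phrase_counts["Imran Khan"])
```

A real keyword chart would usually also drop common stop words before ranking, which this sketch leaves out.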
The average scores across all articles give us an overall picture of the New York Times' coverage of Pakistan in the past year. The overall sentiment score for all articles is -0.0876, showing that the NYT's coverage of the country has, on average, been negative. Coupled with an overall subjectivity score of 0.2866, which is above the 0.25 level treated here as somewhat subjective, we can conclude that the coverage has been not only negative but also subjective.
In total, the NYT got 53,559 retweets across all articles, along with 99,572 likes and 11,068 replies to the tweets, showing that it generated a lot of interest and debate with articles that were largely subjective and negative.
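The corpus-level figures above can be derived from per-article records along the following lines. This sketch assumes that every article carries equal weight in the averages (an unweighted mean of article-level scores) and uses a hypothetical record layout. The two sample records are made up and are not the real data.

```python
from statistics import mean

def summarize(records):
    """Aggregate article-level scores and engagement counts for a corpus.
    Raises statistics.StatisticsError if `records` is empty."""
    return {
        "articles": len(records),
        "mean_sentiment": mean(r["sentiment"] for r in records),
        "mean_subjectivity": mean(r["subjectivity"] for r in records),
        "share_negative": sum(1 for r in records if r["sentiment"] < 0) / len(records),
        "retweets": sum(r["retweets"] for r in records),
        "likes": sum(r["likes"] for r in records),
        "replies": sum(r["replies"] for r in records),
    }

# Made-up records, for illustration only.
records = [
    {"sentiment": -0.5, "subjectivity": 0.25, "retweets": 10, "likes": 20, "replies": 3},
    {"sentiment": 0.25, "subjectivity": 0.75, "retweets": 5, "likes": 8, "replies": 1},
]
summary = summarize(records)
print(summary)
```

Reporting the share of negative articles alongside the mean is useful because a slightly negative mean does not show on its own how many individual articles fall below zero.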
Note: Stories with "Pakistan" in the headline included articles from the Daily Brief/Top Stories section of the NYT. For those articles, only the portion relevant to Pakistan has been analyzed.
Disclaimer: All data has been analyzed using third-party libraries and may contain inaccuracies. Opinions expressed by the author do not represent the opinions of the publication.