HCIL-2017-03

Buntain, C., Golbeck, J.
Evaluating information accuracy in social media is an increasingly important and well-studied area, but limited research has compared journalist-sourced accuracy assessments with their crowdsourced counterparts. This paper demonstrates the differences between these two populations by comparing the features used to predict accuracy assessments in two Twitter data sets: CREDBANK and PHEME. While our findings are consistent with existing results on feature importance, we develop models that outperform past research. We also show that limited overlap exists between the features used by journalists and those used by crowdsourced assessors, and that the resulting models predict each other's assessments poorly while still producing statistically correlated results. This correlation suggests crowdsourced workers assess a different aspect of these stories than their journalist counterparts, but that these two aspects are linked in a significant way. These differences may be explained by the contrast between factual accuracy, as assessed by expert journalists, and perceived accuracy, as assessed by non-experts. Following this outcome, we also present preliminary results showing that models trained on crowdsourced assessments outperform journalist-trained models in identifying highly shared "fake news" stories.