Caspar: Extracting and Synthesizing User Stories of Problems from App Reviews
Hui Guo (hguo5@ncsu.edu) and Munindar P. Singh (mpsingh@ncsu.edu)

A framework for extracting and synthesizing user-app interaction stories in app reviews
Including a first step toward event inference in event pairs

Abstract: A user's review of an app often describes the user's interactions with the app. These interactions, which we interpret as mini stories, are prominent in reviews with negative ratings. In general, a story in an app review contains at least two types of events: user actions and associated app behaviors. Being able to identify such stories would help an app's developer better maintain and improve the app's functionality and enhance the user experience. We present Caspar, a method for extracting and synthesizing user-reported mini stories regarding app problems from reviews. By extending and applying natural language processing techniques, Caspar extracts ordered events from app reviews, classifies them as user actions or app problems, and synthesizes action-problem pairs. Our evaluation shows that Caspar is effective at finding action-problem pairs in reviews. First, Caspar classifies the events with an accuracy of 82.0% on manually labeled data. Second, relative to human evaluators, Caspar extracts event pairs with 92.9% precision and 34.2% recall. In addition, we train an inference model on the extracted action-problem pairs that automatically predicts possible app problems for different use cases. A preliminary evaluation shows that our method yields promising results. Caspar illustrates the potential for a deeper understanding of app reviews and possibly other natural language artifacts arising in software engineering.
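To make the three stages named in the abstract concrete (extract ordered events, classify each as a user action or an app problem, synthesize action-problem pairs), here is a toy sketch in Python. It is not Caspar's implementation: the clause splitter, the keyword list `PROBLEM_CUES`, and the function names are all assumptions made for illustration, and a fixed keyword lookup stands in for the trained classifier that the paper evaluates.

```python
import re
from typing import List, Tuple

# Hypothetical cue stems for app problems. This fixed list is an assumption;
# it stands in for a trained event classifier.
PROBLEM_CUES = ("crash", "freez", "froze", "error", "fail", "stuck")


def split_events(review: str) -> List[str]:
    """Split a review into ordered, clause-like events.

    Splits on punctuation and a few temporal connectives. This is a crude
    stand-in for real event extraction and keeps the textual order only.
    """
    parts = re.split(
        r"[.!?;,]|\b(?:and then|then|when|after)\b",
        review,
        flags=re.IGNORECASE,
    )
    return [p.strip() for p in parts if p.strip()]


def classify_event(event: str) -> str:
    """Label an event as 'problem' if it contains a cue stem, else 'action'."""
    lowered = event.lower()
    return "problem" if any(cue in lowered for cue in PROBLEM_CUES) else "action"


def extract_pairs(review: str) -> List[Tuple[str, str]]:
    """Pair each user action with the app problem that immediately follows it."""
    events = split_events(review)
    labels = [classify_event(e) for e in events]
    pairs = []
    for i in range(len(events) - 1):
        if labels[i] == "action" and labels[i + 1] == "problem":
            pairs.append((events[i], events[i + 1]))
    return pairs


if __name__ == "__main__":
    sample = "I tried to upload a photo and then the app crashed."
    for action, problem in extract_pairs(sample):
        print(f"action: {action!r} -> problem: {problem!r}")
```

The sketch pairs only an action that is directly followed by a problem in the text, so a review phrased as "the app crashes when I open it" yields no pair here. Handling such orderings, and classifying events more reliably than a keyword list can, is where a method like Caspar goes beyond this illustration.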

Overview
Downloads

We publish the following datasets:

Paper
  • [Details][PDF] Caspar: Extracting and Synthesizing User Stories of Problems from App Reviews. Proceedings of the 42nd IEEE International Conference on Software Engineering (ICSE), Seoul, May 2020, 1–13.