Abstract
Users’ growing concerns about online privacy have led to increased platform support for transparency and consent in the web and mobile ecosystems. To that end, Android recently mandated that developers must disclose what user data their applications collect and share, and that information is made available in Google Play’s Data Safety section. In this paper, we provide the first large-scale, in-depth investigation on the veracity of the Data Safety section and its use in the Android application ecosystem. We build an automated analysis framework that dynamically exercises and analyzes applications so as to uncover discrepancies between the applications’ behavior and the data practices that have been reported in their Data Safety section. Our study on almost 5K applications uncovers a pervasive trend of incomplete disclosure, as 81% misrepresent their data collection and sharing practices in the Data Safety section. At the same time, 79.4% of the applications with incomplete disclosures do not ask the user to provide consent for the data they collect and share, and 78.6% of those that ask for consent disregard the users’ choice. Moreover, while embedded third-party libraries are the most common offender, Data Safety discrepancies can be traced back to the application’s core code in 41% of the cases. Crucially, Google’s documentation contains various “loopholes” that facilitate incomplete disclosure of data practices. Overall, we find that in its current form, Android’s Data Safety section does not effectively achieve its goal of increasing transparency and allowing users to provide informed consent. We argue that Android’s Data Safety policies require considerable reform, and automated validation mechanisms like our framework are crucial for ensuring the correctness and completeness of applications’ Data Safety disclosures.