Access Lab 2020 panel discussion: Privacy vs personalization
The panel debate is always a highlight of the annual Access Lab conference, and this year’s was no exception, with a topical discussion of the many and varied issues around privacy vs personalization.
Experts at Access Lab 2020 agreed that a coordinated response is needed to manage the challenge of balancing individual privacy rights against organizations’ use of data to personalize digital services.
Professionals from academic publishing and library services also agreed that education and transparency were vital. They discussed how to manage the tension between protecting user privacy and collecting and analyzing data to improve and curate services based on users’ online behavior and preferences.
Jon Bentley, commercial director at OpenAthens, chaired the live-streamed discussion between Peter Reid, digital services librarian at Bath Spa University; Sebastian Kohlmeier, senior manager of business operations at the Allen Institute for AI (AI2); and Ganesh Gupta, a student partner of Jisc.
Aligned thinking
While each participant brought a unique perspective to the discussion, there was broad agreement on the following points:
- Any personal data should be carefully protected, and organizations should offer complete transparency about how it will be used.
- Students need better education about online data use, the value of their personal data and our increasingly data-driven society.
- Universities and their libraries need to work together to ensure third-party suppliers collect and handle student data appropriately.
- A coordinated approach across publishing, libraries, institutions and even government and policymakers is needed to guide and advise on compliance and to create an ethical framework.
The role of university libraries
Jon Bentley kicked off by pointing out that the answers to most questions on user-centric issues lie not with the service providers but with the end users.
We need to look beyond our own preconceptions and talk to the people who actually use and engage with the software.
Peter Reid felt universities should, in principle, provide a safe space for expression and learning where students are free from having their choices and behavior monitored.
I like the idea of the university or library as sanctuary.
He saw university libraries’ role as intermediaries making “responsible and ethical” choices in the interests of their students when it comes to access management and privacy.
While acknowledging challenges around managing the policies and actions of the many third-party suppliers on which university and library services rely, he said:
My interest is in preserving the place of trust that students have and at least at first, keeping the data under our control.
He felt it was the job of university libraries to monitor how much personal data their suppliers demand, especially when it is made a condition of access.
There’s a question over who is competing to own research or user identities. “There’s an issue of services that have a lot of personalization and they have a strong identity service themselves in terms of the user account.” He added: “Let’s talk about this as a community, because it would have to be built into contracts and contractual discussion when we buy these acquisitions. If this was built into that a bit more, then that would be a market-driven change.”
Personalization for user benefit
Sebastian Kohlmeier, who works on an academic search and discovery engine called Semantic Scholar, which collects user data in order to personalize its service, said the Allen Institute took its duty to protect user data very seriously.
He said the institute did as much as possible not to share details of user activity on the site and to ensure personally identifiable information stayed within its systems.
He acknowledged this was helped by the fact that the institute did not need to generate revenue, which freed it from commercial pressure to sell or share data.
In terms of tracking, we try to be very transparent with our users in terms of what we do from that perspective.
The user experience
Speaking from a user perspective, Ganesh Gupta, a final year business management student at ARU (Anglia Ruskin University), said the key issues were transparency and education.
He called for better education of students about the way their data was being harvested, especially as universities moved towards federated access rather than traditional IP access.
Why my data is kept and passed on, and for what purpose, is something that needs education. When I started my course, I began using academic literature, just peer-reviewed journals and books, to make my arguments more substantial and balanced. However, I was not fully cognizant of how my data was used and how the university dealt with the publishers and databases when I, as a user, accessed these resources. Outside of academic settings I have frequently shared my email and phone number with organizations, but now, to a greater extent, I acknowledge there’s a concern about what data I share.
He asked universities and publishers to work together and put the user first, adding:
To solve these kinds of problems we need cooperation and have to ask what users want as well.
Tracking and data collection
The practical benefits of tracking user behavior were highlighted by Sebastian as he explained how Semantic Scholar uses such data to personalize the feed of research papers within user accounts.
But he reiterated the importance of collecting data responsibly, using user IDs that cannot be traced back to individuals, and of being transparent about how the data is used.
“If you are logged in, we do collect information that users provide, but we are then very explicit about how we use that to surface recommendations,” he said. “If users are rating papers, for example, we have a personalized feed and we will use that information to tune their feed of research papers and related features on the site.” The institute has also shared anonymized data where it is being used in research for the “common good”. “Our mission as an organization is AI for the common good. So we only share data in cases where that’s the specific outcome that someone is looking for.”
Peter said how organizations use data is a key consideration for universities when signing contracts but he also flagged that universities themselves are increasingly keen to use the data to “optimize the student experience.”
We are under a lot of pressure to do that. We are more and more subject to market forces.
He made it clear that sharing data was sometimes, but not always, in the user’s interest, and pointed to the pressure users often come under from digital service providers to share their details.
I’ve used the analogy of a book, you know, it’s almost like the books are kind of jumping off the shelves, trying to get the user’s attention. That’s an aspect of the world wide web that we’re all in. Anyone who owns a website is competing for attention.
He expressed a desire for universities to educate students on personal data sharing and use, calling it one of the universities’ “cross-curricular priorities.”
Ethical considerations
Talking about the risks of mishandling data, Jon warned:
The use of data could move from something that powers insight and adds value to something that is almost surveillance. There’s a sinister aspect to the ability to track students’ phones, to know where they are.
While everyone agreed on the need for clearer guidance and an ethical framework to inform decisions in the future, there were varied views on who should take ownership of the data policies.
Peter felt that while governments should take a lead on data protection overall, university libraries should take a lead on data-sharing policies for access to resources within universities, as they can be a robust voice in implementing those principles within the realm of resource access. Sebastian felt responsibility lay with governments to legislate and provide guidance, while Ganesh called for a coordinated response that better educates users about what they are sharing.