The rise of the data steward
by Sarah Wray, Editor, Cities Today
As data use and collaboration become more advanced, there is a need for a new profession within the public and private sectors, says Stefaan Verhulst, Co-Founder and Chief Research and Development Officer at New York University’s The GovLab. He calls this role the ‘data steward’ and is also seeking to expand existing definitions of the term.
While many cities, government organisations, and private sector companies have chief data officers and chief privacy officers, Verhulst says this new function is broader and necessary as more organisations begin to explore data collaborations which bring together data from various sources to solve problems for the public good.
Many cities, for instance, want to get more value and innovation from the open data they share, and are also increasingly partnering to benefit from private sector data on mobility, spending, and more.
Several examples highlight the challenges, though. There have been disputes about data-sharing and privacy, such as between Uber and the Los Angeles Department of Transportation, while other initiatives have failed to gain traction. Copenhagen’s City Data Exchange facilitated the exchange of public and private data but was disbanded after it struggled to get enough data providers and users on the platform and to become financially sustainable.
Missing links
Verhulst says that beyond ensuring the security and integrity of data, new skills required by data stewards include the ability to secure partnerships, adequately vet data partners and set up data-sharing agreements, as well as the capacity to steward data-sharing initiatives internally and obtain legal and executive buy-in. Data stewards should also develop financial models for data-sharing to ensure partnerships are sustainable over time.
“That’s quite often ignored,” says Verhulst. “It’s assumed that these things will pay for themselves. Well surprise, surprise, there are costs.”
In addition, data stewards have an important role in keeping an active focus on the insights to be drawn from data and the problems to be solved. Many early open data efforts took a ‘build it and they will come’ approach, and usage at scale hasn’t always materialised.
A dynamic regulatory environment is also driving demand for new skills, says Verhulst, noting that the proposed EU Data Act indicates a mandate “to knock on the doors of the private sector [for data] in emergency contexts”.
“The question is: how do you go about that?” Verhulst comments. “Many organisations are going to have to figure this out.”
The new data stewards
The GovLab is now running the third cohort of its training for data stewards, and the first focused on the Eastern Hemisphere.
The Developing a Data Reuse Strategy for Public Problems course is part of The GovLab’s Open Data Policy Lab, which is supported by Microsoft.
Participants include Leah Dowsett, Medical Geneticist at the Association of Native Hawaiian Physicians; Bindu Sharma, Managing Director for APAC at the International Centre for Missing and Exploited Children, Singapore; and Lucy Tyndall, Senior Advisor at the New Zealand Ministry for the Environment.
Previous participants include delegates from the City of San Jose, Swisscom Blockchain, UNICEF, and the National Statistical Institute of Uruguay.
“It’s a vast variety of participants,” says Adrienne Schmoeker, Senior Fellow at The GovLab and former Deputy Chief Analytics Officer for the City of New York. “We are really trying to have diversity of organisations, regions and sizes of companies because there’s also a lot to be learned from each other.”
An open data first in Nepal
Among the latest cohort is Susmina Manandhar, Open Data Fellow at technical services company NAXA and Changunarayan Municipality, Nepal, where she is working on launching an open data portal alongside Siddhanta Neupane, the municipality’s ICT officer.
As well as being a first for the municipality, which was formed in 2017 through a merger of former village development committees, Manandhar says the initiative is also the first of its kind across Nepal. It will create a “one-stop data warehouse,” where all datasets from the municipality are available via a single platform for internal staff, the public, researchers and the private sector.
The integrated database management system and open data platform will bring together data from various disparate systems, including health, education, and agriculture.
“At first it was my perception that this is simply about collecting data and just uploading it,” Manandhar told Cities Today. That was until she learned more about the concept of data stewardship.
“We have to manage so many things, such as we need to know what sorts of datasets the users are trying to collect from the government. We need to maintain the data quality, maintain the data standards, and look at accessibility and availability, and maintain its accuracy as well,” she says.
“I’m really looking forward to learning more about this and applying the learning and experiences from this course to our open data portal.”
Culture change in Syracuse
Nicolas Diaz, Chief Innovation & Data Officer, City of Syracuse, NY, took part in the first data stewards’ cohort in November 2020, shortly after he took on his role and was grappling with many of the issues the training addresses.
“My main takeaway was understanding that there’s a whole value chain around data,” he says. “People tend to assume that data is all about getting a tonne of datasets and then letting the AI run wild to find connections, or alternatively they think it’s all about deploying futuristic technology in the city and that will translate automatically into better outcomes.
“Participating in this programme dispelled that myth and [developed my] understanding that you really need to think about data holistically.”
He said he also discovered tools to assess and address the quality of data, engaged with deeper ethical dimensions, and began to consider issues such as the sustainability of collaborative data models.
For instance, a project in Syracuse that preceded Diaz’s appointment saw the city using mobility data from Facebook to inform public health decision-making, with support from the COVID-19 Mobility Data Network, a group of infectious disease epidemiologists from universities around the world.
Data stewardship concepts such as “publishing with purpose” are playing a key role in strategic planning and new initiatives within Diaz’s office.
The city is moving from a system of siloed departmental data sources and systems to an integrated cloud-based system.
“We can do a lot more,” Diaz says. “And that’s where the concept of data reusability comes in.”
One project focuses on improving the operational capacity of the water finance department, which currently uses manual paper-based processes.
Diaz comments: “We want to use our data infrastructure to make that more straightforward and save money and time, but at the same time as we’re connecting that data infrastructure, we also want to build analytical databases which allow us to do a lot more.”
This could include historical analysis and potentially AI-powered predictions that flag where there is water leakage based on the billing.
“But we cannot get there unless we have that base layer of infrastructure,” Diaz says.
Tracking investment outcomes with data
Syracuse received US$123 million in American Rescue Plan Act (ARPA) funding, and Diaz’s office is part of the core team for this, including the creation of a performance management framework.
As well as tracking allocation, planning and implementation, projects must also have outputs and outcomes, including equity metrics in many cases.
“This is something that’s quite revolutionary for us…we were not used to thinking around data in this way,” says Diaz, and it’s likely to represent a long-term culture change.
“This is a huge pot of money for us — almost half of our yearly operating budget. We want it to transform our community but we also want to use it to transform our internal city practices.”
The city has also launched a surveillance technology working group, which includes city representatives, community stakeholders, and external and industry experts.
“That starts putting a lot of oversight around [any] potential new data collection capabilities of the city, which is something we’ve never done or thought about before,” comments Diaz.
While he says there is some overlap between a chief data officer and a data steward, “that multi-faceted approach is what’s important and maybe the responsibilities of the data steward will be shared among many different folks.”
This article first appeared on Cities Today.
Image credit: Claudio Schwarz via Unsplash