American Association of State Highway and Transportation Officials

Special Committee on Research and Innovation

 

FY2023 NCHRP PROBLEM STATEMENT

 

Problem Number:  2023-A-15

 

Problem Title

Managing Privacy Issues When Using and Storing Connected Vehicle Data

 

Background Information And Need For Research

Connected vehicles have the potential to provide a rich source of data for state and local transportation authorities.  The data will be so rich that unethical actors could use it to exploit or violate the privacy of travelers.  This research intends to help state and local transportation authorities gain the most benefit from the data while minimizing their liability for traveler privacy. 

Connected vehicle data will go much further than the typical counts from point sensors.  Information in the basic safety message (BSM), transmitted 10 times per second, can provide count, speed, travel time, vehicle density, vehicle length, status of windshield wipers and lights, and perhaps even origin-destination, all at high spatial and temporal resolution in real time.  Authorities may even infer very localized road weather conditions from the status of lights, wipers, and brakes, all contained in the BSM.  Such data can enable a host of new applications and operational capabilities and provide new insights for transportation authorities.

Even though the security certificates and broadcast identifiers are supposed to change frequently to make it harder to track specific vehicles, it would take little effort to correlate such data with individuals because the GPS coordinates are not encrypted and are transmitted in the clear.  Therefore, there is a risk to individual privacy wherever this data is collected and stored.  To avoid litigation and other problems, entities that intend to collect, store, and use connected and automated vehicle data need to prepare proactively to secure this kind of data.  Threats include hackers stealing, altering, or destroying data; insiders abusing it; mistakes causing inadvertent exposure; and inadequate anonymization that leads to privacy violations.  Agencies cannot retroactively install protections against such threats.  The threat environment is too sophisticated for that, so developers and data users must consider data security and privacy from the beginning. 
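The re-identification risk described above can be illustrated with a minimal sketch.  It assumes an observer has already reduced the broadcasts to the last observation of each expiring identifier and the first observation of each new identifier, with positions converted to local planar metres.  The function name, the thresholds, and the data layout are hypothetical and are not taken from any BSM standard; the point is only that unencrypted position and time can be enough to join two identifiers.

```python
import math

def link_pseudonyms(last_seen, first_seen, max_gap_s=1.0, max_dist_m=30.0):
    """Link an expired identifier to a new one by position and time continuity.

    last_seen / first_seen map an identifier to (time_s, x_m, y_m).
    The thresholds are illustrative assumptions, not standard values.
    """
    links = {}
    for old_id, (t0, x0, y0) in last_seen.items():
        candidates = []
        for new_id, (t1, x1, y1) in first_seen.items():
            gap = t1 - t0
            if 0 < gap <= max_gap_s:
                dist = math.hypot(x1 - x0, y1 - y0)
                if dist <= max_dist_m:
                    candidates.append(new_id)
        # Only an unambiguous match is treated as a link.
        if len(candidates) == 1:
            links[old_id] = candidates[0]
    return links

# Hypothetical observations: two vehicles change identifiers 0.1 s apart.
last_seen = {"a1": (10.0, 0.0, 0.0), "b1": (10.0, 500.0, 0.0)}
first_seen = {"a2": (10.1, 2.5, 0.0), "b2": (10.1, 502.0, 0.0)}
links = link_pseudonyms(last_seen, first_seen)
```

In dense traffic several candidates may fall within the thresholds and the sketch then declines to link, which suggests why the effectiveness of identifier changes depends on traffic conditions as well as on how the data is stored.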

Another hurdle for government agencies is public opposition to governments collecting information, even though, ironically, the public gives poorly informed consent to private companies collecting personal cellphone data.  Public concern for privacy, and hence pressure on legislators, is growing, however, and will increase the future costs of mishandling data that affects individual privacy. 

Areas to consider must include encryption, anonymization, data retention policies, access control, sharing policies, and possibly more.
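Two of these areas, anonymization and data retention, can be sketched in a few lines.  The example below is an illustration under stated assumptions rather than a recommended design: the `Record` fields, the three-decimal rounding (roughly 110 m of latitude), the one-minute time bin, and the retention window are all hypothetical choices.  Coarsening of this kind is not by itself adequate anonymization; it only shows where such a step would sit in a pipeline.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    temp_id: str        # temporary broadcast identifier (hypothetical field)
    timestamp_s: float  # seconds since some epoch
    lat: float
    lon: float
    speed_mps: float

def coarsen(rec, decimals=3, time_bin_s=60):
    """Drop the identifier, round the position, and bin the timestamp."""
    return {
        "time_bin_s": int(rec.timestamp_s // time_bin_s) * time_bin_s,
        "lat": round(rec.lat, decimals),
        "lon": round(rec.lon, decimals),
        "speed_mps": rec.speed_mps,
    }

def purge_expired(records, now_s, retention_s):
    """Keep only records no older than the retention window."""
    return [r for r in records if now_s - r.timestamp_s <= retention_s]

# Hypothetical records, one recent and one older than the retention window.
recent = Record("x7", 1_000_125.0, 35.78056, -78.63912, 13.4)
old = Record("q2", 900_000.0, 35.77990, -78.64010, 9.8)

reduced = coarsen(recent)
kept = purge_expired([recent, old], now_s=1_000_200.0, retention_s=86_400)
```

Whether reduction should happen before storage, and how long raw records should be kept, are among the policy questions this research would need to answer.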

Research questions to address include:

1.         How should agencies handle the scale of data from millions of vehicles transmitting ten times a second?  Should data be aggregated at the point of collection?  Should one collect samples or all raw data? 

2.         What processing, if any, should occur at collection, at the edge, and in the cloud, where these are part of the data architecture, and how can the data be protected at each point and in the communications between them? 

3.         Do the applications using the data affect how the data must be collected, transferred, processed, and stored?  Or are the protections independent of data use? 

4.         How does scale affect the ability to protect the data versus the utility of the information?  Is it necessary to collect it all?  At what point does scale make it too hard to protect? 
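The scale and aggregation questions above can be made concrete with a rough sketch.  Only the 10 Hz transmission rate comes from this problem statement; the vehicle count, the message size in bytes, the five-minute interval, and the minimum-count threshold are hypothetical values chosen for illustration.  The second function shows one possible form of aggregation at the point of collection, in which small groups are suppressed so that sparse bins do not expose individual vehicles.

```python
from collections import defaultdict
from statistics import mean

BSM_RATE_HZ = 10               # from the problem statement
ASSUMED_VEHICLES = 1_000_000   # hypothetical number of transmitting vehicles
ASSUMED_BYTES_PER_BSM = 300    # hypothetical message size

def raw_volume_per_hour(vehicles, rate_hz, bytes_per_msg):
    """Return (messages, bytes) generated in one hour of raw collection."""
    messages = vehicles * rate_hz * 3600
    return messages, messages * bytes_per_msg

def aggregate(records, interval_s=300, min_count=5):
    """Aggregate (timestamp_s, segment_id, temp_id, speed_mps) tuples.

    Returns a count of distinct identifiers and a mean speed per segment
    and interval. Bins with fewer than min_count identifiers are dropped.
    Identifiers that rotate within an interval would be counted twice.
    """
    bins = defaultdict(lambda: {"ids": set(), "speeds": []})
    for ts, segment, temp_id, speed in records:
        key = (segment, int(ts // interval_s) * interval_s)
        bins[key]["ids"].add(temp_id)
        bins[key]["speeds"].append(speed)
    result = {}
    for key, b in bins.items():
        if len(b["ids"]) >= min_count:
            result[key] = {
                "vehicles": len(b["ids"]),
                "mean_speed_mps": round(mean(b["speeds"]), 2),
            }
    return result

messages, total_bytes = raw_volume_per_hour(
    ASSUMED_VEHICLES, BSM_RATE_HZ, ASSUMED_BYTES_PER_BSM
)

# Hypothetical sample: five vehicles on segment "A", two on segment "B".
sample = [(float(i), "A", f"v{i}", 10.0 + 2 * i) for i in range(5)]
sample += [(3.0, "B", "w0", 20.0), (4.0, "B", "w1", 22.0)]
summary = aggregate(sample)
```

Under these assumptions, one hour of raw collection produces 36 billion messages, or about 10.8 terabytes, while the aggregated output holds one small record per segment and interval.  The trade-off the research questions raise is that aggregation discards detail that some applications may need.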

 

Literature Search Summary

Paige M. Boshell, “The Power of Place: Geolocation Tracking and Privacy,” Business Law Today, American Bar Association, March 25, 2019,  https://businesslawtoday.org/2019/03/power-place-geolocation-tracking-privacy/

A comprehensive treatise on the exploitation of personal location data, market forces, and current state of Federal and State regulation. 

U.S. Senate Committee on Commerce, Science, and Transportation, “Protecting Consumer Privacy,” Hearing, September 29, 2021,  https://www.commerce.senate.gov/2021/9/protecting-consumer-privacy

The latest in a long history of Congressional activity trying to deal with privacy. 

Bill Chappell, “Uber Pays $148 Million Over Yearlong Cover-Up Of Data Breach,” NPR, September 27, 2018,  https://www.npr.org/2018/09/27/652119109/uber-pays-148-million-over-year-long-cover-up-of-data-breach

One example of the cost of a transportation-related data breach and its subsequent mishandling. 

Sydny Shepard, “The Average Cost of a Data Breach,” Security Today, July 17, 2018,  https://securitytoday.com/articles/2018/07/17/the-average-cost-of-a-data-breach.aspx

Cybertalk.org, “Ransomware attacks on the transportation industry, 2021,” July 28, 2021,  https://www.cybertalk.org/2021/07/28/ransomware-attacks-on-the-transportation-industry-2021/

The transportation industry witnessed a 186% increase in weekly ransomware attacks.  Those kinds of attacks include threats to release data with privacy repercussions. 

Gina Tonn et al., “Cyber Risk and Insurance for Transportation Infrastructure,” Wharton School, University of Pennsylvania, Risk Management and Decision Processes Center, Working Paper # 2018-02, March 2018,  https://riskcenter.wharton.upenn.edu/wp-content/uploads/2018/03/WP201802_Cyber-Security-Transportation-Sector.pdf

Malicious data breaches make up 27.1% of all incidents, with an average loss of $0.33 million.  Privacy-related incidents also occur very frequently, accounting for 22.9% of incidents, with average losses over $1.5 million, which is more severe than the losses caused by malicious data breaches.  Among all the incident types, unintentional disclosure of data has the most destructive impact on victim companies, with an average loss of $3.17 million. 

 

Research Objective

The objectives of this research are to:

a)         Identify the risks to transportation authorities of collecting data that can lead to privacy violations.

b)         Identify processes, techniques, methods, design requirements, activities, and policies to prevent and mitigate potential privacy violations.

c)         Create a guidance document for transportation authorities to help them address the privacy issue proactively. 

 

Urgency and Potential Benefits

Connected vehicle data can supplant existing traffic data collection efforts, sparing agencies much of the cost of data collection and maintenance.  Because the data will have so much higher resolution and cover all roads, not just a few sample points, it will enable a host of new services and capabilities as noted above, from operations and planning to policy.  Such widespread impact is as difficult to quantify as the potential of the internet was thirty years ago, other than to say it will be significant.  However, there are some obvious potential benefits.  One is to displace some of the $50 to $100 million per year spent on traffic counting programs.  Another is the ability to more precisely predict where transportation agencies will see the most growth, which can help with prioritizing funding. 

There is an opportunity cost to not being able to capitalize on data available from almost every vehicle.  But privacy violations cost the public trust as well as money.  Data security specialists estimate that companies in the U.S. lose an average of almost $8 million per incident due to security breaches and that the cost is increasing almost five percent a year.  Transportation agencies are even less able to afford these risks. 

This need was first identified by the TRB Traffic Monitoring Committee ACP70 in 2016.  Although there have been advancements in this area since then, the need still exists and is even more pressing given advanced technology and the development of new data sources from vendors, for example, those using video recognition with AI and machine learning technologies.

Being able to leverage high-quality traffic count data from continuous counting equipment is paramount for traffic volume count programs and allows agencies to improve facilities and safety based on traffic volume data that shows demand and user facility needs. 

Those benefiting from this research include state, county, MPO, and city traffic program managers; signal operating agencies; and the public at large (through resource savings and the increased safety and mobility that come from improved data and decision-making).

 

Implementation Considerations

           Consider legislation to ensure that the practices outlined as a part of this research are adequately implemented.

           Standards are under development for a pedestrian safety message (PSM), which would perform the same function for vulnerable road users like pedestrians and cyclists that the BSM does for vehicles.  Because these handheld devices would also transmit location information, agencies could collect this data over the air as well.  Would the handheld transmitter simply be a smartphone or instead a device similar to what is in vehicles?  Would the policies and procedures for securing vehicle-based data apply directly to pedestrian devices, or would implementation of collection, aggregation, and anonymization need to be adjusted for them? 

 

Recommended Research Funding and Research Period

    $400,000 over 24 months

 

Problem Statement Author(s): For each author, provide their name, affiliation, email address and phone.

           Alan Chachich, Volpe, (617) 494-3530, alan.chachich@dot.gov

           Chris Vaughan, North Carolina State University, (919) 515-8036, clvaugha@ncsu.edu

           Lawrence A. Klein, Klein & Associates, (714) 356-2275, larry@laklein.com

 

Potential Panel Members: For each panel member, provide their name, affiliation, email address and phone.

None at this time. 

 

Person Submitting The Problem Statement: Name, affiliation, email address and phone.

Kent L. Taylor, North Carolina Department of Transportation, (919) 345-9829 mobile, kltaylor@ncdot.gov