Data from bus and rail intelligent transportation systems (ITS) are a valuable resource for transit service planning and management. In particular, vehicle location and passenger activity data from automatic vehicle location (AVL), automatic passenger counter (APC), and automatic fare collection (AFC) systems can be used to provide essential insight into transit operations and to inform decision making to increase the efficiency, productivity, and safety of transit service. There are, however, significant challenges for transit agencies in accessing and using this data. Many agencies cannot get to the data at all or do not understand the data they have. Data validation and quality control, integration and matching across various data sets, and aggregating data are all difficult, as is developing the types of reports, tools, and analytics that contribute to informed decision making. Even when transit agencies, researchers, and consultants do address these challenges, they often have difficulty sharing their work with their peers in the industry because the same types of data may be managed differently among transit agencies. The result is that transit ITS data is rarely used to its full benefit.
Creating a common approach to accessing and managing transit ITS data would facilitate the development and exchange of data management practices, of advanced reports and tools, and of new analytical techniques among transit agencies. The use of the General Transit Feed Specification (GTFS) format as the basis for a wide variety of service planning and customer information tools is a useful comparison. By providing a common way of representing schedule data, GTFS has enabled significantly faster development of tools than would have occurred if every transit agency kept its data in a different format. There is therefore a recognized need for research to generate advancements in the use of transit ITS data comparable to what GTFS has achieved for schedule data.
The objective of this research is to develop a common, practical approach to storing, accessing, and managing fixed-route transit ITS data. This data management approach should address the following:
1. Current data sources, access, quality, and integration challenges;
2. Relationship between historical schedule data and ITS data;
3. Differing needs of transit agencies and other users of transit ITS data;
4. Optimal configuration allowing regular improvement and sharing across the industry;
5. Exchange of reports, tools, and analytical techniques based on transit ITS data; and
6. Methods and procedures for developing operations management reporting and decision-support systems.
Note: For the purpose of this research study, the scope of ITS data elements includes the following core data sets:
1. AVL data that includes stop and time-point arrival and departure times, intermediate vehicle location observations, and other vehicle event data;
2. APC data that includes boarding and alighting counts by trip and stop; and
3. AFC data that includes individual fare transactions.
Where possible, these data elements should be matched with schedule service information. The data management approach should allow expansion in the future to accommodate other transit operational data.
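As an illustration only, the three core data sets and their link to schedule information might be sketched as follows. Every record type and field name here (for example, ScheduledStopTime, stop_sequence, and fare_type) is a hypothetical assumption made for this sketch and is not defined by the research statement; the matching key of service date, trip, and stop sequence is likewise one plausible choice, not a prescribed one.

```python
from dataclasses import dataclass
from datetime import datetime

# All names below are hypothetical; they are not part of any published data structure.

@dataclass(frozen=True)
class ScheduledStopTime:
    service_date: str          # e.g. "2021-03-01"
    trip_id: str
    stop_sequence: int
    stop_id: str
    scheduled_arrival: datetime

@dataclass(frozen=True)
class AvlStopEvent:
    service_date: str
    trip_id: str
    stop_sequence: int
    actual_arrival: datetime
    actual_departure: datetime

@dataclass(frozen=True)
class ApcCount:
    service_date: str
    trip_id: str
    stop_sequence: int
    boardings: int
    alightings: int

@dataclass(frozen=True)
class AfcTransaction:
    transaction_time: datetime
    vehicle_id: str
    fare_type: str


def match_to_schedule(schedule, avl_events):
    """Pair each AVL stop event with its scheduled stop time, or None if unmatched."""
    # Index the schedule by the assumed key: (service date, trip, stop sequence).
    index = {(s.service_date, s.trip_id, s.stop_sequence): s for s in schedule}
    return [
        (e, index.get((e.service_date, e.trip_id, e.stop_sequence)))
        for e in avl_events
    ]
```

Unmatched events are kept and paired with None rather than dropped, so that gaps between ITS data and the schedule remain visible for later quality review.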
The outcome of this research study should:
1. Support transit agencies in accessing and using their ITS data,
2. Prepare transit agency staff for the world of big data management and analysis,
3. Improve understanding of policy implications of changing data structures on data governance and vendor relationships,
4. Promote development and sharing of tools and reports based on ITS data,
5. Promote research using ITS data, and
6. Ultimately, improve the quality, efficiency, and safety of transit service.
To address the objective, two key areas require research and development, divided into two phases.
Phase I: Data Management Approach
The first area of research is to recommend a data management approach for storing, accessing, integrating, managing, and using ITS data. The approach should be one that can be applied successfully across transit agencies of different sizes and staffing skills, different modes of service, and different source ITS systems. This data management approach includes transferring data from source systems, defining historical schedules, and matching ITS data to the schedule. It will also identify issues and opportunities for integrating data sources and improving data quality, determine the structure in which the data will be stored, and demonstrate how users and software tools will access the stored data.
Building this data management approach will require interviews with transit agencies, ITS vendor staff, academic researchers, and other stakeholders covering what data they collect, how they use and would like to use this data, how they approach the design and review of data structures, and how they handle data governance. It will also require an analysis of sample data sets. The research will also include review and analysis of existing industry frameworks and data standards in the United States and abroad. The research team will work closely with the TCRP project panel to structure the information gathering and review process. Phase I will conclude with an interim report and an in-person meeting with the panel and the research team. The interim report will provide a detailed summary of all work completed under Phase I, a description of alternative approaches to building a data structure for collecting and managing ITS data, and a detailed scope of work for Phase II. TCRP approval of the interim report is required prior to moving into Phase II.
Phase II: Specifications for Data Structure and Processing Tools
The second area of research will define the data structure and develop detailed functional requirements for two sets of tools. This data structure will provide a standard set of files and fields for fixed-route transit ITS data. In support of this data structure, the first set of tools is intended to transfer data from source ITS systems into the data structure. The second set of tools is intended to validate data quality and calculate basic derived values (e.g., travel time, actual stop arrival and departure times, and vehicle load). The data structure and associated tools should also help organize and summarize data for reporting on operations management needs.
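To make the idea of basic derived values concrete, the following is a minimal sketch of how vehicle load and travel time might be calculated, together with one simple quality check. The function names and input shapes are assumptions made for illustration; they do not represent the functional requirements to be produced by this research.

```python
from datetime import datetime

# Hypothetical helpers; names and input formats are assumptions for this sketch.

def vehicle_loads(stop_counts):
    """Return the load leaving each stop, given (boardings, alightings) in stop order."""
    loads = []
    load = 0
    for boardings, alightings in stop_counts:
        load += boardings - alightings
        loads.append(load)
    return loads


def has_negative_load(loads):
    """Flag a trip whose running load drops below zero, a common sign of APC count error."""
    return any(load < 0 for load in loads)


def travel_time_seconds(departure, arrival):
    """Travel time between a departure at one stop and an arrival at a later stop."""
    return (arrival - departure).total_seconds()
```

For example, counts of (5, 0), (3, 2), and (0, 6) at three consecutive stops give running loads of 5, 6, and 0, and a trip with any negative running load would be flagged for review.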
Note: The actual tools will not be developed as part of this study. The detailed functional requirements will be designed for use by individual transit agencies to build analytical tools applicable to specific and unique requirements of that agency.
Phase II should include a validation procedure: identify a set of transit agencies representing various characteristics and work with these agencies to evaluate the data structures and detailed functional requirements, addressing various transit modes. Based on this interaction, recommend a process to maintain the data structure and disseminate the research results across the transit industry.
The research plan should build in appropriate checkpoints with the TCRP project panel including, at a minimum: (1) a kick-off teleconference meeting to be held within 1 month of the contract’s execution date; (2) the face-to-face interim review meeting to be held at the end of Phase I; and (3) at least two additional teleconferences tied to TCRP review and approval of any other interim deliverables as deemed appropriate.
Final deliverables include:
1. A final report that documents the entire research effort, including a description of the results of interviews, surveys, validation procedures, and other information-gathering activities;
2. Documentation of the data structure and detailed functional requirements along with guidance on use, application, and maintenance;
3. A stand-alone executive summary that outlines the research findings and recommendations;
4. Presentation materials aimed at transit agencies and others, including both staff and senior management, that simply and concisely present the data management approach and explain effective data management procedures, agency benefits, and institutional requirements; and
5. An executive-level brochure/infographic highlighting findings of the research or some other innovative approach to facilitate dissemination to a broad audience, particularly affected transit agencies.
Final deliverables also include a stand-alone technical memorandum entitled, “Implementation of Research Findings and Products.”
Status: Complete. TCRP Research Report 235: Improving Access and Management of Public Transit ITS Data proposes a data structure for storing data from bus and rail intelligent transportation systems (ITS).
Supplemental to the report are an Overview Presentation, a “How To” Presentation, and an Executive 2-pager on the benefits of the proposed data structure.