American Association of State Highway and Transportation Officials

Special Committee on Research and Innovation

 

FY2023 NCHRP PROBLEM STATEMENT TEMPLATE

 

Problem Number:  2023-A-02

 

Problem Title

Enterprise Data Warehouse Implementation Guide

 

Background Information and Need For Research

As part of a robust data governance strategy, agencies must decide how best to manage the storage, access, and dissemination of data products and services, both for internal use/reuse and for external distribution, within their data architecture. An important piece of a modern data architecture is an enterprise data warehouse, which conceptually provides a way to reduce data redundancy, improve data consistency, and enable data usage for better decisions. Effective implementation of a data warehouse is complex, especially when an entity has highly diverse data sets and technology infrastructure. DOTs would therefore benefit greatly from guidance on how best to architect and implement an enterprise data warehouse strategy that meets diverse business needs while remaining performant. However, the industry lacks a guidebook on how to develop and implement a comprehensive data warehouse for a DOT.

 

The guidance should cover the complete set of functional requirements, including but not limited to federation (aggregating data from multiple sources), extract-transform-load (ETL) and extract-load-transform (ELT) processes, storage, naming conventions, structure (data model), roles and responsibilities, web services, data publishing processes, access by third-party applications, and records management. Additionally, any conventions, models, and structures promoted by the guidance will need to support and align with standard frameworks used in transportation, such as Industry Foundation Classes (IFC), to the greatest extent practical.
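To illustrate the difference between the ETL and ELT approaches named above, the following is a minimal sketch only, not a recommended design. It assumes an in-memory SQLite database standing in for the warehouse; the table names (wh_etl, wh_elt, staging), the sample rows, and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical source rows from two business units with inconsistent formatting.
SOURCE_ROWS = [
    ("pavement", " us-50 ", "12.5"),
    ("bridges", "I-95", "3"),
]


def etl(conn, rows):
    """ETL: clean and type the rows in application code, then load them."""
    conn.execute("CREATE TABLE wh_etl (unit TEXT, route TEXT, miles REAL)")
    cleaned = [(unit, route.strip().upper(), float(miles))
               for unit, route, miles in rows]
    conn.executemany("INSERT INTO wh_etl VALUES (?, ?, ?)", cleaned)


def elt(conn, rows):
    """ELT: load raw text into a staging table, then transform with SQL
    inside the warehouse itself."""
    conn.execute("CREATE TABLE staging (unit TEXT, route TEXT, miles TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE wh_elt AS "
        "SELECT unit, UPPER(TRIM(route)) AS route, "
        "CAST(miles AS REAL) AS miles "
        "FROM staging"
    )


def run():
    """Run both pipelines on the same input and return both result sets."""
    conn = sqlite3.connect(":memory:")
    etl(conn, SOURCE_ROWS)
    elt(conn, SOURCE_ROWS)
    query = "SELECT unit, route, miles FROM {} ORDER BY unit"
    etl_rows = conn.execute(query.format("wh_etl")).fetchall()
    elt_rows = conn.execute(query.format("wh_elt")).fetchall()
    conn.close()
    return etl_rows, elt_rows
```

Both paths produce the same cleaned table here. The practical difference is where the transformation runs and whether the raw data is retained in the warehouse (in the staging table) for later reprocessing.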

 

Literature Search Summary

The value of data warehouses for transportation has been discussed since 1997, and the practice has continued to be developed and refined for specific business needs. However, guidance has thus far been piecemeal; no complete guidebook exists on how to develop and implement a comprehensive data warehouse for a DOT.

 

In the late 1990s and early 2000s, data warehouse development in DOTs included one effort focused on a single business unit, a programming and scheduling data mart with recommended approaches, and another with a broader scope of transportation data and an “analytical toolkit” that included a discussion of business requirements and the resultant design approach.

 

In the more recent past, literature on data warehousing in the transportation realm has included design options for electric mobility, basic design needs for a maritime data warehouse, a regional data warehouse to support a real-time incident detection system, and a state-focused data warehouse designed for analyzing traffic safety improvements/advancements using eight integrated data sources covering health/medical, traffic incident, driving record, court record, and census data, among others. Notably, none of these reports references detailed guidelines that a DOT could leverage to develop and implement a comprehensive data warehouse for the entirety of an agency’s data.

 

One recent research report evaluated open data portals for the “quality of data, ease of use, and availability of metadata” across many transportation agencies and found wide variations, highlighting the need for DOTs to bring data management and access up to a more standardized and accessible benchmark for a “ubiquitous audience”. The report described the best-evaluated portal and its qualities but did not provide a guide for replicating it.

 

Finally, an NCHRP report on the effect of digitization on DOTs, published in 2020 and including a DOT survey, highlighted the need for new research or synthesis on “best practices of enterprise data warehouse management”.

 

Research Objective

This research is intended to develop a guidebook on data warehouse implementation to support efficient use, sharing, and reporting of data and to address transportation agency business needs. The deployable product from this research is a best-practice guide that departments of transportation (DOTs) can use to guide the development and implementation of an enterprise data warehousing strategy encompassing data in structured (tabular), semi-structured or unstructured (non-tabular), and geospatial (including geometry) file formats.

 

This study should broadly consider the needs of state DOTs and provide guidance for assessing data warehouse opportunities across an organization’s business needs, along with the strategies used to develop, implement, and integrate data warehouses. The guidance is expected to address data in structured (tabular), semi-structured or unstructured (non-tabular), and geospatial formats.

 

The product should provide guidance on:

           Strategies to assess current data warehouse capabilities and opportunities.

           A glossary that defines terms and operating procedures, keeping in mind the diverse operational contexts of different state DOTs.

           Identification of essential data warehouse capabilities (input, access, cybersecurity/permission management, privacy, data quality, etc.)

           The data management strategies and technology needed to provide capabilities for the development and implementation of data warehouses.

           Factors critical for the adoption of data warehouses by agencies and the business functions the warehouse supports. This is anticipated to include surveys or interviews with state DOT Chief Data Officers or equivalent, as well as business domain experts for defining reporting and analytical requirements.

 

Urgency and Potential Benefits

Data resources have been developed by independent business areas to meet their own needs, which has often resulted in data silos and challenges in accessing data for analysis and for effective and timely decision making. Business practices have become more interdisciplinary, which is increasing the need to access data across business units. Data warehousing provides a cost-effective strategy to align and access cross-organization data resources. Data warehouses support efficient analysis and reporting by providing a single, stable source of the integrated data that decision makers commonly need. With limited resources for data management, data warehouses provide a cost-effective means to support the multifaceted needs of state DOTs.

 

Benefits of data warehousing include:

           Improve daily decision making by raising data quality and increasing accessibility and security through inventory best practices.

           Improve organizational productivity through improved interoperability of data and automation of common analysis and reporting activities.

           Save agencies time and cost by providing guidance on enterprise data warehouse strategies that can be applied quickly to improve the current state of data, rather than duplicating efforts with multiple or unknown outcomes.

           Make data sharing across business units and agencies timelier and more efficient through the adoption of common standards.

           Build and maintain a historic record of agency data that is separate from operational data stores that are highly dynamic.

           Improve efficiency of design and implementation of collaborative efforts such as the AASHTOWare product portfolio.

           Enable development of common performance metrics.

 

Implementation Considerations

Each state DOT needs to manage its information systems within its state’s requirements. This guidance will provide useful information for agency policies and procedures that support each agency’s strategy for data warehouse development and management.

 

To support implementation, research activities and products will be shared through:

           The AASHTO Committee on Data Management & Analytics meetings, forums, webinars, and website

           The TRB Standing Committee on Statewide/National Transportation Data and Information Systems

           The TRB Standing Committee on Information Systems and Technology

           AASHTO GIS-T symposium, ITE conference, and USDOT meetings and webinars

           Outreach to other AASHTO committees that would benefit from the guide

 

Recommended Research Funding And Research Period 

Recommended Funding: $350,000

Research Period: 18 months

 

Problem Statement Author(s): For each author, provide their name, affiliation, email address and phone.

Buffy Conrad, Enterprise Data Governance Manager, Maryland State Highway Administration, BConrad@mdot.maryland.gov, 410-545-8405

 

Leni Oman, Knowledge Strategist, WSDOT, omanl@wsdot.wa.gov, 260-705-7974

Michael Pipp, Chief Data Officer, Montana Dept. of Transportation, mipipp@Mt.gov, 406-444-6060

 

Jack Dartman, Supervisor/Architect, Montana Dept. of Transportation, jdartman@mt.gov, 406-444-7937

 

Chad Baker, Geospatial Data Officer, Caltrans, chad.baker@dot.ca.gov, 916-247-1625

 

Potential Panel Members: For each panel member, provide their name, affiliation, email address and phone.

Michael Pipp, Chief Data Officer, Montana Dept. of Transportation, mipipp@Mt.gov, 406-444-6060

 

Jack Dartman, Supervisor/Architect, Montana Dept. of Transportation, jdartman@mt.gov, 406-444-7937

 

Chad Baker, Geospatial Data Officer, Caltrans, chad.baker@dot.ca.gov, 916-247-1625

 

Denise Whitney-Dahlke, Strategic Data Program Manager, Oregon Department of Transportation, Denise.D.WHITNEY-DAHLKE@odot.state.or.us, 971-719-6274

 

Person Submitting The Problem Statement: Name, affiliation, email address and phone.

Chad Baker, Geospatial Data Officer, Caltrans, on behalf of the AASHTO Committee on Data Management and Analytics, chad.baker@dot.ca.gov, 916-247-1625