Strengthening Data Science Methods for Department of Defense Personnel and Readiness Missions
The Office of the Under Secretary of Defense (Personnel & Readiness), referred to throughout the report as P&R, is responsible for the total management of all Department of Defense (DoD) personnel, including recruitment, readiness, and retention. This mission requires extensive data, a large number and variety of complex analyses, and access to skilled workers to extract meaningful information to guide DoD personnel and readiness policies. With the advent of newer sources of data, such as social media, and of modern data analytics, P&R has the opportunity to exploit new tools that may produce more powerful analyses and improve the effectiveness and efficiency with which it accomplishes its mission. However, cultural and technological challenges must be addressed, including improving data access and sharing while ensuring proper privacy protection, enhancing analytic methods, and improving workforce education. An important step in addressing these challenges is developing a data and analytics framework, taking into account current and desired capabilities and addressing barriers accordingly.
This National Academies of Sciences, Engineering, and Medicine report of the Committee on Strengthening Data Science Methods for Department of Defense Personnel and Readiness Missions offers suggestions on which data analytics capabilities could be targeted and which considerations to keep in mind to advance the framework for these capabilities. The study’s full statement of task is shown in Box S.1. This report considers data science in its broadest sense: a multidisciplinary field that concerns technologies, processes, and systems to extract knowledge and insight from data and to support reasoning and decision making under various kinds of uncertainty.
There are two primary aspects of interest within the field of data science, namely (1) the management and processing of data and (2) the analytical methods and theories for descriptive and predictive analysis and for prescriptive analysis and optimization. The first aspect involves data systems and data preparation, including databases and warehousing, data cleaning and engineering, and some facets of data monitoring, reporting, and visualization. The second aspect involves data analytics and includes data mining, text analytics, machine and statistical learning, probability theory, mathematical optimization, and visualization of results.
Currently, analyses developed to support P&R are often disjointed, one-time efforts that respond to immediate questions and may lack any plan for future use of their data or methods. A comprehensive data and analytics framework, properly implemented, could add coherence to this work, expanding the types of questions that P&R can quickly examine, reducing the cost of analyses, improving the reliability of findings, and better informing policy decisions. While developing this framework, both the short-term and long-term needs of the Secretary of Defense and the responsibilities of P&R should be considered.
The Force of the Future initiatives being pursued by Secretary of Defense Ashton Carter aim to make the DoD workforce more equitable, efficient, and flexible through a number of efforts such as increasing the interchange of personnel with the civil sector, offering more family-friendly benefits, changing how military personnel are promoted, and improving the opportunities for civil service personnel. One aspect of this would be the establishment of an Office of People Analytics to better harness DoD’s big data capabilities in the service of managing personnel talent. This would be done by increasing the understanding of personnel characteristics and analyzing how policy or environmental changes will affect the performance or composition of the workforce. The development of a data and analytics framework could revolutionize how data and analytics are used by P&R while contributing to the goals of the current Force of the Future initiative.
Finding: Despite the substantial amount of data available on DoD personnel, the data may not be appropriate for DoD’s analytic tasks, or they may necessitate considerable investment in constructing the variables of interest.
Finding: Analyses developed to support the Secretary of Defense are often disjointed, one-off activities undertaken to respond to immediate questions and may lack a plan for future use of data or analytic methods.
Finding: The reuse of operational data for analytic purposes can expose issues in data collection, recording, transmission, cleaning, coding, and loading. Problems are often not detected until the point of analysis, when anomalies crop up in results.
Recommendation 1: The Office of the Under Secretary of Defense (Personnel & Readiness) should develop a data and analytics framework, and a strategy to implement that framework, that addresses both the principal outcomes of its responsibilities and the short-term and long-term needs of the Secretary, based on the findings, recommendations, and discussions outlined in this report and in the Force of the Future proposals.
Developing a data and analytics framework is a complex task, with many components that need to be addressed both individually and systematically. Data need to be easily accessible and shared across groups in a way that reduces the hurdles researchers and analysts currently face when they seek to find or share data, while ensuring proper privacy and security protections. Analytic methods available to P&R need to be expanded to enable stronger and more rapid responses to significant P&R research and analysis questions. Prescriptive methods that would allow P&R to better assess alternatives and recommend actions could be used more extensively. The workforce that P&R relies on for its analytics also needs to be improved, which is a challenge facing organizations worldwide. Each of these components is briefly described in the following sections, which also discuss potential short-, medium-, and long-term goals to help move P&R toward developing a data and analytics framework. Data quality and sharing can be improved immediately, while data science methods can be enhanced in the medium term and data science education strengthened in the long term.