June 15, 2023

Data Privacy

Findings from the CNAS Technology Policy Lab

Protecting consumer data privacy is at the center of many of Washington’s technology-related debates. Apps such as TikTok, for example, raise fears about how foreign adversaries might collect and use American consumers’ data to further the aims of authoritarian regimes, posing risks to civil liberties and national security.¹ The targeted advertising practices of big tech companies, such as Google and Meta, have likewise generated bipartisan interest in comprehensive data privacy legislation, given how businesses can often collect information on individuals without their consent.² More recently, ChatGPT has spurred new urgency in Congress, in the Biden administration, and among business leaders to address the data privacy risks posed by artificial intelligence (AI) systems, including leaks of personal and sensitive information.³

These examples highlight the real-world challenges of protecting consumer data. They also underscore the issues that spring from the United States’ fragmented approach to data privacy, in which existing privacy laws are not always applied consistently, depending on the specific data types, states, or localities involved.⁴ The Family Educational Rights and Privacy Act (FERPA), which protects the privacy of student education records at the federal level, is an example of a law that applies nationally but only in some contexts. Regionally, there are at least 54 different data breach notification laws across the United States, which makes it challenging to understand and protect privacy rights in a comprehensive manner.⁵ In contrast to these data- and region-specific privacy laws, today’s data environment (characterized by an explosion of available data and rapidly evolving data-driven technological advancements) necessitates clearer and more consistent guidance for companies to protect consumer privacy.⁶

To address regulatory gaps and better streamline existing laws, Congress should pass a national comprehensive data privacy law. Although comprehensive legislation gained bipartisan momentum in the proposed American Data Privacy and Protection Act (ADPPA), it is unlikely to pass before the 2024 elections. Many factors have resulted in Congress’ stalemate; these include limited political capital to push for reforms, pushback from representatives who argue state laws would be weakened by a federal law that preempts them, difficulty scoping consumer consent frameworks, and cost burden for implementation—particularly for small- to medium-sized companies.⁷

These factors also have caused the United States to cede leadership in shaping democratic norms for data privacy to the European Union (EU), whose General Data Protection Regulation (GDPR) went into effect in 2018.⁸ Although GDPR has its shortfalls, including overly burdensome requirements for organizations and loose restrictions on data brokers, the regulation at least provides a set of rules that companies are expected to follow and that consumers can count on for protections.⁹ As a result, other countries (including authoritarian regimes like China and democracies such as Japan and South Korea) have chosen to align with GDPR while the United States works on passing its own comprehensive data privacy law.¹⁰

Recognizing the complexity of the U.S. data privacy landscape and its impact on the United States’ ability to lead, CNAS convened a working group to discuss the following questions, with the goal of improving data privacy protections without oversimplifying them:

What are the benefits and shortcomings of U.S. data privacy laws that already exist?
How can Congress improve data management—which encompasses the collection, use, and storage of data—at the state level and within private companies?¹¹
Since data drives many critical and emerging technologies—such as AI and quantum computing—what impact does comprehensive data privacy have on America’s technological development and innovation capacity?

Absent a national comprehensive data privacy law in the near term, Lab participants identified steps that Congress and the Biden administration can take to improve the U.S. approach to data privacy protections. Specifically, Congress and the administration can focus on creating clear and consistent processes for data collection and data documentation, both of which are currently handled within individual companies. Consistent processes will help companies navigate various state-level frameworks and create accountability measures for responsible, safe, and transparent data management. Additionally, they will address national security–related concerns, such as competing with China’s authoritarian approach to data collection and use.

Within data collection processes, policymakers should prioritize clarifying a U.S. approach to managing data volume, meaning how companies determine the amount of data they will collect. Policymakers struggle to address questions surrounding data volume because many of them do not have the technical expertise to discuss what appropriate levels of data collection should look like. Absent a consistent approach to data collection, companies are tasked with addressing data volume questions themselves. This often leads to one of two scenarios: they either collect too little and therefore do not have the quality of data needed to meet their business objectives, or they collect more data than required to meet their goals, which may infringe on consumer privacy rights. Alternatively, companies may follow GDPR as the de facto standard, which poses a question for policymakers about whether they would like American companies to follow the EU’s lead. Advancements in machine learning may help companies mitigate the impact of these data volume challenges, given that it can help users identify useful data more efficiently than manual processes. Even so, the lack of a standard process for data collection would still yield unintended consequences for consumer privacy. For example, differing data collection processes across departments can create silos within a single company. This inhibits transparency within an organization and among consumers, who would not have a reliable picture of how much of their information a given company collects and uses.

Managing data volume is also a national security imperative. Rapid advancements in large language models (LLMs), such as ChatGPT, and heightened strategic competition with China (vis-à-vis data-driven technologies that include AI and quantum computing) require U.S. policymakers to think carefully about the strategic importance of data. China recognizes its value, as demonstrated by the Chinese government’s recent restrictions on overseas access to China-based data sources.¹² This action, spurred by U.S. think tank reports, further protects the mass data collection China has prioritized for the Chinese Communist Party’s interests since 2013, when President Xi Jinping took office.¹³ The United States should not replicate the Chinese model of data collection, which is built upon laws that permit the Chinese government access to any citizen’s or company’s information for national security or intelligence purposes.¹⁴ However, the United States also must recognize data as a “strategic asset” and act accordingly, balancing impacts related to consumer harms and national security interests in any new data privacy policies or regulations.¹⁵

The second process that policymakers should address is data documentation, which is defined by the World Bank as “the process of recording any aspect of project design, sampling, data collection, cleaning, and analysis that may affect results.”¹⁶ Clear data documentation processes are often missing from company workflows because individual organizations must create them and then manage their data internally, which is extremely resource intensive. However, data documentation processes are essential to assuring consumers that collected data is being managed responsibly. A federal-level data documentation process would therefore remove the burden of transparency from companies by setting standard expectations for organizational workflows.

Data documentation, in addition to building trust and transparency, provides technical clarity by capturing nuances in data that may signal quality issues, including what the data represents and what is and is not included in the dataset. Importantly, this technical clarity enhances U.S. competitiveness because it improves trust in the U.S. AI data ecosystem, which currently faces challenges, such as gaps in datasets that impact an AI model’s accuracy. A clear data documentation process that captures data quality would therefore make the AI ecosystem more competitive with China’s compulsory approach because available data could be used more effectively.
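To make the idea concrete, the following is a minimal sketch of what a single data documentation record might look like in practice. It is an illustration only: the class name, the field names, and the sample dataset are hypothetical and are not drawn from the Lab's discussions, the World Bank definition, or any NIST framework. The sketch records what the data represents and what is and is not included, and it flags any documentation fields that were left empty.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    """Hypothetical documentation record for one dataset (illustrative fields only)."""
    name: str
    represents: str                # what the data is meant to describe
    collection_method: str         # how the data was gathered
    included: list[str] = field(default_factory=list)    # what the dataset contains
    excluded: list[str] = field(default_factory=list)    # what was deliberately left out
    known_gaps: list[str] = field(default_factory=list)  # known quality issues

    def undocumented_fields(self) -> list[str]:
        """Return the names of fields that are still empty."""
        return [key for key, value in asdict(self).items() if not value]


# Sample record with two fields left blank (assumed example data).
record = DatasetRecord(
    name="customer_support_tickets_2023",
    represents="Support tickets filed through the web form",
    collection_method="",
    included=["ticket text", "product category"],
    excluded=["phone transcripts"],
)

print(record.undocumented_fields())  # ['collection_method', 'known_gaps']
```

A shared record format of this kind is one way an organization could check, before a dataset is used, whether the basic documentation a reviewer would expect has actually been filled in.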

CNAS Technology Policy Lab participants proposed several steps that U.S. government agencies should take to define the processes surrounding data collection and data documentation. The following recommendations are informed by those ideas.

For data collection:

Congress should mandate studies on underexplored data privacy concepts, such as data minimization and data intermediaries. Instead of continuing a narrative of “more data is better” for innovation in the United States, Congress should mandate a study that explores methods for data minimization, a concept included in ADPPA that limits data collection of personal information to what is necessary and directly relevant to a given use case.¹⁷ This would help shift the focus of data privacy protections from the volume of data collected to the relevance of that data. Additionally, Congress should mandate a study that explores how the use of data intermediaries could help ensure that shift in focus. The government of the United Kingdom describes data intermediaries as “a broad term that covers a range of different activities and governance models for organizations that facilitate greater access to or sharing of data.”¹⁸ Through these studies, Congress should factor in existing data minimization requirements in GDPR and the California Consumer Privacy Act to determine if they would scale across or be best suited for the United States. Likewise, Congress should assess the strengths and weaknesses of the European Data Governance Act (DGA), which includes the creation of data intermediaries. Efforts to use data minimization and data intermediaries at the company level should be studied as well, considering that these concepts can be voluntarily adopted.

The National Institute of Standards and Technology (NIST) should develop standards for data collection. NIST has effectively created voluntary and flexible oversight mechanisms for other sectors that integrate input from industry stakeholders. For example, the Cybersecurity Framework was first scoped by NIST to address critical infrastructure, but it has broadened over time with input from both the public and private sectors. Confirming its value, other countries, such as Israel, Italy, and Uruguay, have since adopted variations of the NIST framework.¹⁹ Consequently, a similar model should be applied to data collection. Given that public-private cooperation was key to the Cybersecurity Framework’s success, engagement with outside stakeholders (such as think tanks) and private industry throughout the data collection standard development process will be important for ensuring standards can be applied to all company sizes and levels of government.

Congress should mandate a study to explore how the United States should scope ethical practices for data storage and use. Beyond fears about how and what type of data is collected, consumers worry about where their data goes. Scoping ethical practices for data storage and use would therefore be a logical next step for promoting transparency and building consumer confidence in data management.
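The data minimization concept named in the first recommendation above can be pictured with a short sketch. This is an assumption-laden illustration, not a description of how ADPPA, GDPR, or any company implements minimization: the use cases, the field names, and the `minimize` helper are all hypothetical. The point is only that collection is tied to a stated use case, and anything not needed for that use case is never kept.

```python
# Hypothetical mapping from a use case to the fields judged necessary for it.
# In practice this judgment is exactly what a standard or study would have to inform.
NECESSARY_FIELDS = {
    "order_shipping": {"name", "street_address", "postal_code"},
    "age_verification": {"birth_year"},
}


def minimize(record: dict, use_case: str) -> dict:
    """Keep only the fields listed as necessary for the given use case.

    An unknown use case keeps nothing, so collection defaults to "no data".
    """
    allowed = NECESSARY_FIELDS.get(use_case, set())
    return {key: value for key, value in record.items() if key in allowed}


# Assumed example: a form submission that contains more than shipping requires.
submitted = {
    "name": "A. Customer",
    "street_address": "1 Example Street",
    "postal_code": "00000",
    "birth_year": 1990,
    "browsing_history": ["page_a", "page_b"],
}

kept = minimize(submitted, "order_shipping")
print(sorted(kept))  # ['name', 'postal_code', 'street_address']
```

Defaulting to an empty result for an unlisted use case reflects the relevance-first framing in the text: data is retained only when a specific purpose calls for it.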

For data documentation:

Congress should direct NIST to conduct a review of agency data management practices. Working with the White House Office of Science and Technology Policy (OSTP), NIST should then use this review to inform the creation of a data documentation framework.

After NIST reviews agency data management practices, Congress should create and appropriate funding for workforce training programs focused on data documentation methodology. Data—and, by extension, the technology it powers—is only as good as the people who assess, integrate, and use it. As a result, workforce training programs would ensure data documentation processes are not only created but also implemented. These programs should be offered at every U.S. government agency.

Although data privacy protections can exist at both state and federal levels, it is not sustainable for comprehensive data privacy laws to continue advancing in a fragmented fashion. In addition to being cumbersome, confusing, and costly for companies to implement, disparate data privacy laws put consumer rights at risk, create consumer distrust, and hinder U.S. technological and economic competitiveness. Tackling data collection and data documentation processes will be a positive step forward for data privacy in the United States.

Acknowledgments

This CNAS Technology Policy Lab was made possible with the generous support of Schmidt Futures. CNAS also thanks all experts who participated in this Lab.

As a research and policy institution committed to the highest standards of organizational, intellectual, and personal integrity, CNAS maintains strict intellectual independence and sole editorial discretion and control over its ideas, projects, publications, events, and other research activities. CNAS does not take institutional positions on policy issues and the content of CNAS publications reflects the views of their authors alone. In keeping with its mission and values, CNAS does not engage in lobbying activities and complies fully with all applicable federal, state, and local laws. CNAS will not engage in any representational activities or advocacy on behalf of any entities or interests and, to the extent that the Center accepts funding from non-U.S. sources, its activities will be limited to bona fide scholastic, academic, and research-related activities, consistent with applicable federal law. The Center publicly acknowledges on its website annually all donors who contribute.

About the Technology Policy Lab

This policy brief is a product of the CNAS Technology Policy Lab, a working group structure designed to incubate solutions to crucial yet underdeveloped technology policy problems. Each lab is composed of subject matter experts from academia, industry, and the policy community collaborating to develop concrete recommendations to bolster U.S. national security interests and promote American competitiveness. We thank all experts who participated in this Lab.

Endnotes

1. Kelvin Chan and Haleluya Hadero, “Why TikTok’s security risks keep raising fears,” Associated Press, March 23, 2023, https://apnews.com/article/tiktok-ceo-shou-zi-chew-security-risk-cc36f36801d84fc0652112fa461ef140.
2. Frederic D. Bellamy, “U.S. data privacy laws to enter new era in 2023,” Reuters, January 12, 2023, https://www.reuters.com/legal/legalindustry/us-data-privacy-laws-enter-new-era-2023-2023-01-12/.
3. Ben Derico, “ChatGPT bug leaked users’ conversation histories,” BBC News, March 23, 2023, https://www.bbc.com/news/technology-65047304; and Lindsey Wilkinson, “Samsung employees leaked corporate data in ChatGPT: report,” Cybersecurity Dive, April 10, 2023, https://www.cybersecuritydive.com/news/Samsung-Electronics-ChatGPT-leak-data-privacy/647219/.
4. Bellamy, “U.S. data privacy laws to enter new era in 2023.”
5. Thorin Klosowski, “The State of Consumer Data Privacy Laws in the US (And Why It Matters),” The New York Times, September 6, 2021, https://www.nytimes.com/wirecutter/blog/state-of-privacy-laws-in-us/?partner=slack&smid=sl-share.
6. Petroc Taylor, “Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2020, with forecasts from 2021 to 2025 (in zettabytes),” Statista, September 8, 2022, https://www.statista.com/statistics/871513/worldwide-data-created/.
7. “Governor Newsom, Attorney General Bonta and CPPA File Letter Opposing Federal Privacy Preemption,” Office of Governor Gavin Newsom, press release, February 28, 2023, https://www.gov.ca.gov/2023/02/28/governor-newsom-attorney-general-bonta-and-cppa-file-letter-opposing-federal-privacy-preemption/.
8. “The History of the General Data Protection Regulation,” European Data Protection Supervisor, May 17, 2023, https://edps.europa.eu/data-protection/data-protection/legislation/history-general-data-protection-regulation_en.
9. Matt Burgess, “How GDPR Is Failing,” WIRED, May 23, 2022, https://www.wired.com/story/gdpr-2022/.
10. Piotr Foitzik, “Different Approaches to Data Privacy: Why EU-US Privacy Alignment in the Months To Come Is Inevitable,” CPO Magazine, July 18, 2022, https://www.cpomagazine.com/data-privacy/different-approaches-to-data-privacy-why-eu-us-privacy-alignment-in-the-months-to-come-is-inevitable/; and Yan Luo et al., “How China’s draft SCCs compare with EU SCCs,” International Association of Privacy Professionals, July 21, 2022, https://iapp.org/news/a/how-chinas-draft-sccs-compare-with-eu-sccs/.
11. “What is Data Management?” Oracle, May 17, 2023, https://www.oracle.com/database/what-is-data-management/.
12. Lingling Wei, “U.S. Think Tank Reports Prompted Beijing to Put a Lid on Chinese Data,” The Wall Street Journal, May 7, 2023, https://www.wsj.com/articles/u-s-think-tank-reports-prompted-beijing-to-put-a-lid-on-chinese-data-5f249d5e.
13. Matt Pottinger and David Feith, “The Most Powerful Data Broker in the World Is Winning the War Against the U.S.,” The New York Times, November 30, 2021, https://www.nytimes.com/2021/11/30/opinion/xi-jinping-china-us-data-war.html.
14. National Counterintelligence and Security Center Director William Evanina, “Keynote Remarks as Prepared for Delivery,” June 4, 2019, International Legal Technology Association LegalSEC Summit 2019, Arlington, Virginia, https://www.dni.gov/files/NCSC/documents/news/20190606-NCSC-Remarks-ILTA-Summit_2019.pdf.
15. Jake Sullivan, “Remarks by National Security Advisor Jake Sullivan at the National Security Commission on Artificial Intelligence Global Emerging Technology Summit,” July 13, 2021, The White House, https://www.whitehouse.gov/nsc/briefing-room/2021/07/13/remarks-by-national-security-advisor-jake-sullivan-at-the-national-security-commission-on-artificial-intelligence-global-emerging-technology-summit/.
16. “Data Documentation,” The World Bank, May 17, 2023, https://dimewiki.worldbank.org/Data_Documentation.
17. “Data Protection Glossary,” European Data Protection Supervisor, May 17, 2023, https://edps.europa.eu/data-protection/data-protection/glossary/d_en.
18. “Unlocking the value of data: Exploring the role of data intermediaries,” UK Government Centre for Data Ethics and Innovation, July 22, 2021, https://www.gov.uk/government/publications/unlocking-the-value-of-data-exploring-the-role-of-data-intermediaries/unlocking-the-value-of-data-exploring-the-role-of-data-intermediaries#Section-1.
19. “NIST Releases Version 1.1 of its Popular Cybersecurity Framework,” National Institute of Standards and Technology, April 16, 2018, https://www.nist.gov/news-events/news/2018/04/nist-releases-version-11-its-popular-cybersecurity-framework.

Author

Alexandra Seymour

Former Associate Fellow, Technology and National Security Program

Alexandra Seymour was an Associate Fellow for the Technology and National Security Program at CNAS. Her work focuses on artificial intelligence, defense innovation, semiconduc...
