Literature Review - The ICEberg Effect

Introduction

This project sits at the intersection of three bodies of literature: the sociology of immigration enforcement, the computer science and statistics of dataset bias, and the emerging critical data studies scholarship on administrative systems and state power. What follows is a structured review of the most relevant work in each domain, followed by a synthesis of how our contribution relates to and extends the existing literature.

I. The Criminal Alien Program and database-driven enforcement

The most directly relevant empirical predecessor to this project is Gardner and Kohli's "The C.A.P. Effect: Racial Profiling in the ICE Criminal Alien Program" (Warren Institute, UC Berkeley, 2009). Analyzing arrest data from Irving, Texas during the 23-month period surrounding the city's adoption of CAP, Gardner and Kohli documented a dramatic increase in discretionary arrests of Hispanic residents for minor offenses, particularly traffic violations, after CAP gave local police 24-hour ICE access. Their central finding was that only 2% of those detained by immigration authorities in a 14-month period had received felony charges. They concluded that CAP was not only failing to target serious criminal offenders but was tacitly encouraging local police to arrest Hispanic residents for petty offenses, effectively using minor criminal justice contact as a gateway into immigration enforcement.

This finding establishes the mechanism our project formalizes at national scale: the CAP pipeline generates criminality labels not because of the severity of the underlying offense, but because of the decision to query criminal databases in certain communities. Gardner and Kohli show this at the local level through before-after analysis; we show the same structure at the national level through an individual-level audit of 713,464 records across the 2022-2026 enforcement period.

The American Immigration Council's 2014 report "Enforcement Overdrive: A Comprehensive Assessment of ICE's Criminal Alien Program" extends this analysis nationally. Using DHS enforcement data through 2013, the Council documented that the proportion of Mexican nationals removed through CAP exceeded their share of the foreign-born noncitizen population by 39 percentage points. They flagged this overrepresentation as raising serious concerns about whether racial disparities among arrests that lead to immigration enforcement might be generalized practices rather than local anomalies. Critically, they also note geographic variation: states with the highest CAP removal rates per 1,000 noncitizens include Mississippi, Wyoming, West Virginia, Kentucky, and Texas, states that are not simply border or high-immigration states, suggesting that the enforcement intensity reflects local enforcement infrastructure rather than immigration demographics.
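
The over-representation measure described above is a difference between two shares, expressed in percentage points. A minimal sketch of that arithmetic follows; the function name and the counts are illustrative assumptions, not figures taken from the Council's report.

```python
def representation_gap(group_removals, total_removals,
                       group_population, total_population):
    """Gap, in percentage points, between a group's share of removals
    and its share of the reference population."""
    removal_share = 100.0 * group_removals / total_removals
    population_share = 100.0 * group_population / total_population
    return removal_share - population_share


# Illustrative counts only; these are NOT the figures from the 2014 report.
gap = representation_gap(
    group_removals=700, total_removals=1000,
    group_population=300, total_population=1000,
)
print(f"{gap:.1f} percentage points")
```

A positive gap means the group is over-represented among removals relative to the chosen reference population; the result depends heavily on which reference population is used.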

Our project updates and extends these findings into the current enforcement era, using more granular individual-level data and a more formally specified analytical framework. We show that the nationality-pipeline correlation documented by the Council persists in the 2022-2026 period and can be connected, through regression analysis, to differential deportation outcomes within equivalent legal categories.

II. Crimmigration and the structural targeting of Latino men

The sociological literature on "crimmigration" (the convergence of the criminal justice and immigration enforcement systems) provides the institutional context for understanding why the CAP pipeline produces the patterns we observe.

Juliet Stumpf's foundational 2006 article "The Crimmigration Crisis" in the American University Law Review first theorized the structural merger of criminal and immigration law, arguing that immigration violations had been progressively criminalized, creating a legal architecture in which prior criminal justice contact became a gateway to immigration consequences regardless of whether the original offense had any connection to immigration law. This is precisely the mechanism operating in the CAP pipeline: any database contact, however minor, can trigger an immigration enforcement response.

Tanya Golash-Boza's 2015 book Deported: Immigrant Policing, Disposable Labor, and Global Capitalism (NYU Press) provides the most comprehensive empirical account of the mass deportation era. Golash-Boza documents that fewer than half of deportees in the mass deportation period had criminal records, and that most of those who did were convicted of minor traffic violations or border-crossing offenses rather than serious crimes. The figure she and Hondagneu-Sotelo (2013) report, that 97% of deportees were Latino and 90% were men, is the direct population-level manifestation of the pipeline targeting we document at the individual level. Our finding that 40.7% of arrests of Mexican nationals occur via CAP, compared with 10.8% of arrests of Peruvian nationals, quantifies the differential that Golash-Boza describes qualitatively.
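
The nationality differential reported above is a per-group share of arrests that arrive through CAP. The sketch below shows one way such a share could be tabulated. It is a hedged illustration: apprehension_method is the variable named in this review, while the "nationality" key, the "CAP" value, and the toy records are assumptions made for the example.

```python
from collections import Counter


def cap_share_by_nationality(records, cap_value="CAP"):
    """For each nationality, the fraction of records whose
    apprehension_method equals cap_value."""
    totals = Counter()
    cap_counts = Counter()
    for r in records:
        totals[r["nationality"]] += 1
        if r["apprehension_method"] == cap_value:
            cap_counts[r["nationality"]] += 1
    return {nat: cap_counts[nat] / totals[nat] for nat in totals}


# Toy records with hypothetical group names and method values.
records = (
    [{"nationality": "X", "apprehension_method": "CAP"}] * 2
    + [{"nationality": "X", "apprehension_method": "other"}] * 3
    + [{"nationality": "Y", "apprehension_method": "CAP"}] * 1
    + [{"nationality": "Y", "apprehension_method": "other"}] * 9
)
shares = cap_share_by_nationality(records)
print(shares)
```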

Amada Armenta's 2007 study of the Nashville Police Department's adoption of 287(g) shows the local mechanism of this targeting: faced with pressure to meet departmental arrest quotas, officers aggressively stopped drivers for minor traffic violations, which then triggered immigration enforcement screenings. This observation directly corroborates the CAP mechanism: the database query does not require a serious offense to initiate; it requires only contact. Our finding that the criminality label is largely consistent between apprehension and case resolution (98.3% identical) may reflect exactly this: the initial label, generated by the database contact, is not subsequently revised because there is no independent review mechanism.
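
The consistency figure discussed above is an agreement rate between two labels on the same record. A minimal sketch of that calculation follows, assuming a hypothetical case_criminality field alongside apprehension_criminality and placeholder category names; it is not the project's actual audit code.

```python
def label_agreement(records,
                    first="apprehension_criminality",
                    second="case_criminality"):
    """Share of records whose two criminality labels are identical.
    Records missing either label are skipped."""
    compared = [
        r for r in records
        if r.get(first) is not None and r.get(second) is not None
    ]
    if not compared:
        raise ValueError("no records with both labels present")
    matches = sum(1 for r in compared if r[first] == r[second])
    return matches / len(compared)


# Toy records; "cat_1" etc. are placeholder category names.
records = [
    {"apprehension_criminality": "cat_1", "case_criminality": "cat_1"},
    {"apprehension_criminality": "cat_2", "case_criminality": "cat_2"},
    {"apprehension_criminality": "cat_3", "case_criminality": "cat_1"},
    {"apprehension_criminality": "cat_3", "case_criminality": None},
]
rate = label_agreement(records)
print(f"{rate:.1%}")
```

A high agreement rate on its own cannot distinguish accurate labeling from a label that is simply carried forward unrevised, which is why the interpretation above is hedged.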

The Russell Sage Foundation review by Denvir and colleagues (2025, "Criminalization of Immigration") provides the most current synthesis of this literature, noting that studies have not found empirical support for the claim that CAP and 287(g) reduce crime, and that the dominant finding across the literature is instead racial disparity in targeting, particularly of Latino communities, regardless of the severity of underlying legal violations.

III. Dataset bias and measurement failure in high-stakes AI

The technical framing of our project connects to a rich literature on how apparently neutral administrative variables encode bias through their measurement process.

Obermeyer, Powers, Vogeli, and Mullainathan's 2019 Science paper "Dissecting racial bias in an algorithm used to manage the health of populations" is the canonical reference for the type of analysis we conduct. Obermeyer et al. showed that a widely-used healthcare algorithm systematically assigned lower risk scores to Black patients than to equally sick white patients. The source of the bias was not an explicit racial variable but a proxy: the algorithm used prior healthcare costs as a measure of health needs. Because Black patients with equivalent health conditions had historically lower healthcare costs (due to unequal access), the algorithm systematically underestimated their needs. The key methodological insight is that the measured variable (cost) does not correspond to the latent construct it claims to measure (health need) in the same way across demographic groups. This is the construct validity failure that our project identifies in the apprehension_criminality variable: it does not measure the same underlying legal severity across pipelines, because different pipelines surface different populations through different database query logics.

Friedler, Scheidegger, and Venkatasubramanian's 2016 paper "On the (Im)possibility of Fairness" and the follow-up 2019 FAT* paper "A comparative study of fairness-enhancing interventions in machine learning" establish the theoretical framework for understanding how measurement bias propagates through classification systems. Their concept of "worldview", the assumption that observed scores reflect underlying constructs, is precisely what our project interrogates. The ICE threat scoring system assumes that the threat level variable reflects actual individual threat. We show it partly reflects data collection history.

The broader survey by Ahmad, Vallès, and Idaghdour (Frontiers in Big Data, 2026) identifies six major bias types in AI systems: algorithmic, confounding, implicit, measurement, selection, and temporal. Our project engages primarily with measurement bias (the criminality label as a contaminated measure) and selection bias (only arrested individuals appear in the dataset), but the temporal dimension (the dramatic 2022-2024 shift in pipeline composition) also maps onto their temporal bias category.

Suresh and Guttag's 2021 paper "A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle" (ACM) provides a taxonomy that is also relevant: they distinguish historical bias (bias in the world before data collection), representation bias (bias in how the population is sampled), and measurement bias (bias in how features are operationalized). The CAP pipeline embeds all three: historical bias in who has criminal justice records, representation bias in which communities are screened, and measurement bias in how the screening outputs are labeled.

IV. Critical data studies and administrative legibility

James Scott's Seeing Like a State (1998, Yale University Press) provides the foundational theoretical vocabulary for our project. Scott argues that states require populations to be "legible", simplified, standardized, and categorized in ways that make administrative intervention possible. The production of legibility is not a neutral descriptive act; it involves the creation of categories that then shape the populations they describe. The criminality classification in ICE's dataset is a legibility mechanism: it reduces a complex legal situation to three categories that then determine enforcement trajectory. Our finding that the category is partly produced by the measurement process (the database query) rather than reflecting a pre-existing legal reality is a direct empirical demonstration of Scott's theoretical claim.

Virginia Eubanks' Automating Inequality (2018, St. Martin's Press) extends this framework into the digital era and provides the most direct conceptual predecessor to our analysis. Eubanks documents how the Allegheny County Family Screening Tool, Indiana's welfare eligibility system, and Los Angeles's homelessness allocation algorithm each create what she calls a "digital poorhouse", not through any single discriminatory decision but through the accumulation of individually defensible administrative procedures that systematically amplify disadvantage for poor and minority communities. The pipeline-label-score-program chain we document in ICE enforcement data is an exact instance of this mechanism: no single step is obviously discriminatory; the bias emerges from the architecture of the system.

Kate Crawford's Atlas of AI (2021, Yale University Press) connects this to the broader political economy of data infrastructure, arguing that AI systems are not neutral tools but "a registry of power" that encodes existing social hierarchies into technical categories. Her analysis is relevant to our records falsification concern: Crawford shows that the apparent objectivity of administrative data conceals the political decisions embedded in how data is collected, categorized, and used. The near-perfect consistency between apprehension and case criminality in our dataset (98.3%) may reflect not accuracy but institutional inertia: the category produced at the database query stage is reproduced through the system without independent challenge.

Safiya Umoja Noble's Algorithms of Oppression (2018, NYU Press) provides the framework for understanding how apparently neutral search and classification systems reflect and reproduce existing power relations. While Noble's focus is on commercial search algorithms, her core argument, that the categories embedded in technical systems are not neutral descriptions but political choices, applies directly to the criminality classification system in ICE's data.

Ben Green's work on "The Flaws of Policies Requiring Human Oversight of Government Algorithms" (Harvard Data Science Review, 2022) is directly relevant to our records falsification concern. Green shows that human oversight of algorithmic outputs frequently amounts to rubber-stamping rather than independent review, producing the pattern of consistency across classification stages that we observe in the 98.3% mismatch-free records. If officers review CAP-generated criminality labels but defer to the database output, the label is not independently validated; it is merely reproduced.

V. Data transparency, FOIA, and the politics of enforcement data

The dataset we analyze is itself a product of a contested political process. The Deportation Data Project, which obtained these records through FOIA requests, has publicly noted concerns about data reliability. Their documentation states: "We continue to have questions about the reliability of the encounters and removals data. These encounters and removals datasets are similar to the October versions but different from the late July release that we believe was most reliable." This is not a minor caveat: the data provider is flagging inconsistencies across ICE data releases that suggest possible manipulation or selective reporting.

The broader context of FOIA resistance by ICE during the 2025-2026 period is relevant to interpreting our findings. Reporting by the Columbia Journalism Review in September 2025 documented that nearly two dozen reporters had filed FOIA requests with ICE that were routinely denied or ignored, with ICE increasingly citing "ongoing law enforcement investigations" to refuse disclosure. The data we analyze was produced before this heightened resistance; the records from the 2025 period in our dataset may reflect a different disclosure regime than earlier records.

The legal framework for accessing ICE data was significantly clarified in the Second Circuit's ruling in ACLU v. ICE (American Immigration Council, 2022), which required ICE to provide unique identifiers in FOIA responses to allow cross-table linkage of enforcement records. This ruling has made the kind of individual-level tracking that the DDP dataset enables possible in a way that was not legally guaranteed before. Our analysis is possible because of this legal infrastructure.

Researchers at TRAC (Transactional Records Access Clearinghouse, Syracuse University) have been the most sustained producers of quantitative analysis based on ICE FOIA data, documenting enforcement trends, court outcomes, and detention statistics. Our project is methodologically consistent with their approach, using FOIA-obtained administrative records as the primary empirical source, but asks a different question: not what the enforcement patterns look like, but what the data itself encodes about how those patterns are produced.

VI. What this project contributes that is new

Having reviewed the existing literature, we can state the specific contributions of this project precisely:

1. Scale and recency

Prior empirical work on CAP bias (Gardner and Kohli 2009, AIC 2014) used data from the 2006-2013 period and was often localized to specific cities or states. We analyze 713,464 individual records from 2022-2026, covering the most intense enforcement expansion in the post-Obama era. The patterns have not been quantified at this scale for this period.

2. The pipeline-label-outcome chain

Prior work documented either (a) that CAP was not targeting serious criminals, or (b) that enforcement showed racial disparities in outcomes. We connect these two observations through a formal regression chain showing that the data pipeline is the mechanism: it shapes the label, the label shapes the score, the score shapes the program, the program shapes the outcome. This causal chain has not been formally specified and tested in prior literature.

3. Construct validity framing

Prior enforcement research in sociology typically frames disparities as evidence of racial profiling or institutional racism. While these are valid interpretations, they are difficult to test from aggregate data. The construct validity framing (the claim that the apprehension_criminality variable fails to measure the same construct across pipelines) is a more precise, falsifiable claim that connects to the algorithmic fairness literature and is directly testable with the individual-level data.
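
One simple way to see what "directly testable" could mean here: if the label measured the same construct in every pipeline, outcome rates within a single label category should not differ much by apprehension method. The sketch below illustrates that stratified comparison under stated assumptions. The "removed" field, the category names, and the method values are hypothetical, and this is not the regression specification used in the project.

```python
from collections import defaultdict


def outcome_rates(records):
    """Removal rate for each (criminality label, apprehension method) cell."""
    counts = defaultdict(lambda: [0, 0])  # key -> [removed, total]
    for r in records:
        key = (r["apprehension_criminality"], r["apprehension_method"])
        counts[key][0] += 1 if r["removed"] else 0
        counts[key][1] += 1
    return {key: removed / total for key, (removed, total) in counts.items()}


def max_gap_within_label(rates):
    """Largest difference in removal rate across methods, per label."""
    by_label = defaultdict(list)
    for (label, _method), rate in rates.items():
        by_label[label].append(rate)
    return {label: max(v) - min(v) for label, v in by_label.items()}


def make(label, method, removed_flags):
    """Build toy records (hypothetical values throughout)."""
    return [
        {"apprehension_criminality": label,
         "apprehension_method": method,
         "removed": flag}
        for flag in removed_flags
    ]


records = (
    make("cat_1", "CAP", [True, True, True, False])
    + make("cat_1", "other", [True, False, False, False])
    + make("cat_2", "CAP", [True, False])
    + make("cat_2", "other", [False, True])
)
gaps = max_gap_within_label(outcome_rates(records))
print(gaps)
```

A raw gap of this kind does not adjust for confounders such as time period or location, so it is a screening step at most; a regression with controls would be needed to support any stronger claim.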

4. Dataset audit as contribution

This project audits a publicly available dataset used by journalists, advocates, policymakers, and researchers. The finding that the apprehension_criminality variable is contaminated by apprehension_method is directly relevant to anyone who uses this dataset in downstream analysis. This is a contribution to the usability of the DDP dataset, not just to the understanding of ICE enforcement.
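
For downstream users who want to check the association between the two variables in their own extract, one common summary for two categorical variables is Cramér's V. The sketch below computes it with the standard library only. It is an illustrative choice of statistic rather than the measure used in this project, and the category values are placeholders.

```python
import math
from collections import Counter


def cramers_v(pairs):
    """Cramér's V for an iterable of (method, label) pairs.
    0 means no association; 1 means one variable determines the other."""
    pairs = list(pairs)
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        raise ValueError("need at least two levels of each variable")
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Toy (apprehension_method, apprehension_criminality) pairs.
dependent = [("CAP", "cat_1")] * 10 + [("other", "cat_2")] * 10
independent = (
    [("CAP", "cat_1")] * 5 + [("CAP", "cat_2")] * 5
    + [("other", "cat_1")] * 5 + [("other", "cat_2")] * 5
)
v_dependent = cramers_v(dependent)
v_independent = cramers_v(independent)
print(v_dependent, v_independent)
```

A high value indicates only that the label distribution varies by apprehension method; it does not by itself show that the label is mismeasured, since pipelines may also differ in the populations they reach.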

References

Ahmad, A., Vallès, Y., & Idaghdour, Y. (2026). Bias in AI systems: Integrating formal and socio-technical approaches. Frontiers in Big Data, 8, 1686452.

American Immigration Council. (2014). Enforcement Overdrive: A Comprehensive Assessment of ICE's Criminal Alien Program. Washington, D.C.

Armenta, A. (2007). From sheriff to city police: Local immigration enforcement in Nashville and the impact on the undocumented community. PhD dissertation, University of California, Santa Barbara.

Crawford, K. (2021). Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press.

Denvir, D. et al. (2025). Criminalization of Immigration. RSF: The Russell Sage Foundation Journal of the Social Sciences, 11(3), 282-314.

Deportation Data Project. (2026). ICE enforcement data documentation. deportationdata.org/data/ice.html

Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin's Press.

Friedler, S.A., Scheidegger, C., & Venkatasubramanian, S. (2016). On the (im)possibility of fairness. arXiv:1609.07236.

Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., et al. (2019). A comparative study of fairness-enhancing interventions in machine learning. Proceedings of FAT*, pp. 329-338.

Gardner, T.G., & Kohli, A. (2009). The C.A.P. Effect: Racial Profiling in the ICE Criminal Alien Program. Chief Justice Earl Warren Institute on Race, Ethnicity and Diversity, UC Berkeley School of Law.

Golash-Boza, T. (2015). Deported: Immigrant Policing, Disposable Labor, and Global Capitalism. NYU Press.

Golash-Boza, T., & Hondagneu-Sotelo, P. (2013). Latino immigrant men and the deportation crisis: A gendered racial removal program. Latino Studies, 11(3), 271-292.

Green, B. (2022). The flaws of policies requiring human oversight of government algorithms. Harvard Data Science Review, 4(2).

Noble, S.U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.

Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453.

Scott, J.C. (1998). Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. Yale University Press.

Stumpf, J. (2006). The crimmigration crisis: Immigrants, crime, and sovereign power. American University Law Review, 56(2), 367-419.

Suresh, H., & Guttag, J. (2021). A framework for understanding sources of harm throughout the machine learning life cycle. Proceedings of the ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO '21).
