Limitations of Satellite Imagery Analysis for AI-Specific Data Centers
by Konstantin Pilz & Lennart Heim
BLUF: Satellite imagery of data centers provides limited immediate value for AI-specific insights due to several challenges: AI hardware constitutes only a small portion of data center equipment, often coexists with non-AI hardware, lacks reliable visual identifiers, and its computational power is difficult to estimate from external observations. Continuous monitoring and cross-correlation with other sources may improve insights, but AI-specific information will likely remain limited.
While detecting data centers via satellite imagery can yield valuable information about their size and location, some limitations are often overlooked. The immediate value for AI policy is not as straightforward as it might seem due to a lack of critical insights into AI-specific aspects of data centers. We find the following challenges:
- AI hardware constitutes only a small portion of data center equipment. As of 2024, most data center space still hosts general-purpose compute. For instance, EPRI estimates that AI applications currently account for only about 10% to 20% of all data center power consumption (a figure that likely includes non-frontier AI workloads). According to TrendForce, only about 12% of all servers shipped in 2024 are expected to be AI servers.
- AI hardware often shares space with other equipment in data centers. While there are facilities dedicated to AI supercomputers (e.g., the Jülich Supercomputer and xAI's Supercluster in Memphis), large tech companies frequently use existing data center campuses to host AI compute. For example, GPT-4 was reportedly trained in a Microsoft data center in Iowa, where the company maintains several large facilities that are far larger than needed to house only the approximately 20,000 A100s used for GPT-4's training. Particularly for inference workloads, dedicating entire facilities to AI is unnecessary, and there may be interest in retrofitting existing data centers to accommodate AI hardware alongside traditional hardware.
- Visual identification of AI hardware presence is unreliable. While AI chips have a higher power density and thus require more cooling infrastructure per square foot, this is still only a coarse proxy for whether a facility actually hosts AI-specific hardware. Beyond cooling, AI data centers do not currently appear to have features that visually distinguish them from data centers hosting other compute. Future build-outs with even larger numbers of AI chips might exhibit more distinctive characteristics, potentially making identification easier.
- Estimating a data center's computational (AI) performance is complex and imprecise. Even when a data center is known to host AI hardware, accurately assessing the quantity and type of AI chips remains challenging. Attempts to estimate power consumption based on external structures or energy data and subsequently infer computational performance yield unclear and imprecise results.
Taken together, these challenges significantly limit the immediate value of satellite imagery for AI-specific insights. Identifying the locations of all data centers differs substantially from pinpointing AI-specific facilities. Moreover, even when AI data centers are located, this information does not necessarily provide accurate insights into their computational power, given the uncertainty surrounding the quantity and types of chips they employ.
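To illustrate why inferring chip counts from estimated power is so imprecise, here is a minimal sketch that propagates plausible ranges through the chain "facility power, cooling overhead (PUE), AI share of IT load, power per chip". Every number below is an illustrative assumption chosen for the example, not a measurement or a figure from the sources cited above, and the names `Range` and `estimate_chip_count` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A simple lower/upper bound on an uncertain quantity."""
    low: float
    high: float


def estimate_chip_count(facility_mw: Range, pue: Range,
                        ai_share: Range, watts_per_chip: Range) -> tuple[float, float]:
    """Return (low, high) bounds on the number of AI chips in a facility.

    facility_mw:    total facility power draw in megawatts
    pue:            power usage effectiveness (total power / IT power)
    ai_share:       fraction of IT power going to AI hardware
    watts_per_chip: all-in power per AI chip, including server overhead
    """
    # Fewest chips: least power, worst PUE, smallest AI share, hungriest chips.
    low = facility_mw.low * 1e6 / pue.high * ai_share.low / watts_per_chip.high
    # Most chips: the opposite extreme of every input.
    high = facility_mw.high * 1e6 / pue.low * ai_share.high / watts_per_chip.low
    return low, high


# Illustrative assumptions only (not taken from any real facility):
low, high = estimate_chip_count(
    facility_mw=Range(40, 60),         # assumed estimate from external structures
    pue=Range(1.1, 1.5),               # assumed cooling/overhead range
    ai_share=Range(0.1, 1.0),          # unknown from imagery: 10% to fully dedicated
    watts_per_chip=Range(700, 1400),   # assumed, depends on chip type and server design
)
print(f"Estimated AI chips: {low:,.0f} to {high:,.0f}")
print(f"Upper bound is {high / low:.0f}x the lower bound")
```

Under these assumed inputs the bounds differ by more than an order of magnitude, and the result is still only a chip count: converting it to computational performance additionally requires knowing the chip type, which imagery does not reveal.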
Way Forward
While a general data center mapping project still offers valuable insights and is a good initial step, we recommend two additions based on the above challenges, ideally incorporated from the start:
- Implement continuous monitoring of historical and ongoing development. Tracking data center development over time would make the mapping considerably more valuable. As AI consumes a growing share of compute resources, correlating new construction with AI-specific needs will become more feasible.
- Cross-correlate findings with other intelligence sources. Combining the satellite imagery analysis with other open-source intelligence, such as records of investment, permitting, and construction of AI-related data centers, could yield more detailed insights about specific clusters and their relevance for AI development and deployment.