Paleomobility in Iberia: 12 years of strontium isotope research

We examine the quantity, contextual quality and spatial distribution of the data gathered in the strontium isotope archaeological database for the Iberian Peninsula and the Balearic Islands, an open source available as part of IDEArq (www.idearq.org), the spatial data infrastructure hosted at the Instituto de Historia (CSIC). It has to date 1635 87 Sr/ 86 Sr values from human and animal samples, and environmental proxies recovered during archaeological research. The data has allowed us to produce the first regional and peninsular strontium isoscapes for Iberia. We discuss the benefits and limitations of approaching mobility in Iberia through 87 Sr/ 86 Sr isotope analysis and suggest directions for future collaborative research.


Introduction
The subject of movement, mobility and migration has been highlighted as one of the five grand challenges for contemporary archaeological research (Kintigh et al., 2014: 12). The use of strontium isotope analysis is nowadays one of the most effective methods to face this challenge, offering a systematic, quantitative and comparable approach to the mobility of past human and animal populations (Larsen 2018). With advantages that outweigh well-known drawbacks, as has been effectively summarized by Scaffidi and Knudson (2020), 87 Sr/ 86 Sr ratios from a wide variety of locations and chronologies offer the opportunity for period-based, transhistorical or cross-cultural approaches to paleomobility. Together with aDNA studies (Gokcumen and Frachetti, 2020), strontium isotopes have transformed the way archaeologists and historians engage in the study of human and animal mobility, giving yet another scientific dimension to our knowledge of past social and economic practices.
Population mobility has played an important explanatory role in the interpretation of Iberian historic and prehistoric dynamics. Although mostly a result of a traditionally dominant culture-historical background, it is also a reasonable consequence of both Iberia's history and location at the southwestern tip of the Eurasian continent, closing the only Mediterranean access to the Atlantic and with a coastline<15 km away from the North African shore. Not surprisingly, north-south contacts and long-distance migrations have been a cornerstone for many interpretations, particularly -but not only-of prehistoric phenomena, such as the arrival of Neolithic traits (Zilhão, 2001), the Copper and Bronze Age eastern Mediterranean 'colonies' (Blance 1961), or the bellbeaker expansion and interaction spheres (vid Martínez-Navarrete, 1989: 311-316;Lillios 2020). Regional mobility, such as transhumance and other kinds of herd management, has also been part of the explanation for the distribution of megaliths or the expansion of central Iberia's Bronze Age Cogotas 'culture' (a critical review in Chapman 1979; Blanco and Esparza 2019 respectively). Human mobility, never really forgotten by Spanish and Portuguese prevailing culture-history approaches, is back on the board again, now with a stronger sciencebased archaeological background. It is thus important to approach critically the nature, quality and interpretative potential of the existing data in order to build the foundations for a comparative framework on human mobility in Iberia (Díaz-del-Río, 2020: 45-47;Lillios 2020).
This paper explores the state of the art of 87 Sr/ 86 Sr isotopic values obtained from geolocated human and faunal archaeological samples and related environmental proxies (plants, water, and modern soil) obtained in the context of archaeological research from the Iberian Peninsula and the Balearic Islands published until 2021. It covers 12 years of research, beginning in 2009 with the publication of the first set of strontium data from the Tholos of Palacio III (Almadén de la Plata, Seville) (Díaz-Zorita Bonilla et al. 2009).
We provide the first assessment of over 1.600 87 Sr/ 86 Sr values recovered in the context of archaeological research from 86 archaeological sites ranging from the Neolithic to Medieval times and related environmental proxies (Fig. 1, supplementary material Table 1), assessing the quantity and quality of the sample, its location and distribution by periods, species, sex and age. Finally, we offer the first strontium base map for Iberia based on this sample and suggest strategies for future research.
All this data is available at IDEArq (www.idearq.org), the Spatial Data Infrastructure for the online publication of Iberian repertories of georeferenced archaeological information hosted at the Instituto de Historia (Spanish National Research Council, CSIC). The platform currently provides access to three data sets. IDEArq-C14 includes over eleven thousand radiocarbon dates from the Iberian Peninsula and the Balearic Islands Vicent, 2016, Uriarte et al. 2017). IDEArq-CPRL displays the Corpus of Levantine Rock Art (Cruz Berrocal et al. 2005). Finally, IDEArq-δimP includes isotopic values of strontium, oxygen, carbon and nitrogen obtained from human and animal remains recovered from archaeological contexts of the Iberian Peninsula and the Balearic Islands.
Samples include human and faunal remains from 95 archaeological sites and environmental proxies obtained and published in the context of archaeological research. Humans have been the main sampled species (supplementary material Table 2), as human mobility has been the main object of research. Ongoing extensive research programs on domestic animal mobility (e.g. Valenzuela-Lamas et al. 2018) have substantially increased reported faunal samples, previously mostly used as a contrast of human 87 Sr/ 86 Sr home ranges. As has happened in most other countries, plants, soil and water have only been recently introduced as part of archaeological approaches to past mobility. This is the case for the environmental baseline for Portugal (James et al. 2022).
The Late Prehistory (Neolithic to Bronze Age) accounts for up to 87 % of the total sampled sites and up to 82 % of the samples. While Iberia has abundant and well-preserved Late Antiquity and Medieval burial grounds, and the scale and role of incoming groups throughout these periods has been demonstrated by recent findings (e.g. Gleize et al. 2016), the impact of 87 Sr/ 86 Sr isotope analysis is still relatively moderate when compared to previous periods or other European regions (e. g. Amorim et al 2018;Francisci et al 2020).
Over half of the human samples have been recovered in Copper Age contexts (c. 3200-2200 cal BC), mostly from collective burials (supplementary material Table 3). But again, the size of the sample stands in stark contrast with the relatively low collection of sampled Bronze Age burials, a period known in much of Iberia for its individual burial practices, some of which are highly emblematic such as those from the southeastern Argaric group (e.g. Lull 2000;Lull et al. 2021). Faunal remains have a very similar chronological distribution, mostly selected as a proxy for human mobility home ranges (supplementary material Table 4). Behind all these regional and chronological differences are diverse research groups with their own individual research strategies. For example, the bulk of the data for Marroquíes (Jaén, Spain) (Díaz-Zorita Bonilla et al 2018) and Perdigões (Évora, Portugal) (Valera et al 2020), two of the most sampled sites in Iberia, was obtained in order to understand population aggregation dynamics at these mega-sites. All the data recovered from Central Iberia has been obtained in the context of one research program oriented to assess the traditionally assumed mobility of their late prehistoric populations (e.g. Díaz-del-Río et al., 2017b). That is, research interest is the main reason behind the temporal and geographical distribution of the published sample.
Two broad groups of archaeological contexts can be distinguished (Fig. 3, supplementary material Table 5). The first are pit burials, almost 40 % of the total human sample, a category that includes from medieval single graves to cists and all prehistoric pit burials. The rest are mostly collective burials of different nature: megalithic structures, tholoi, artificial and natural caves. The fact that over 61 % of the samples are commingled burials of this second group has a direct effect on another relevant aspect of the sample such as sex estimation.
Almost 60 % of the isotopic values are not assigned to a specific anthropological sex (supplementary material Table 6), even though half of them (48 %) are adults. This is largely the effect of the weight of commingled burials in the sample, a context where anthropological sex estimation is radically underrepresented.
Other than preservation issues, both criteria and quality of judgment depend on the bioanthropologist in charge of the estimation, and their effects will particularly affect cumulative databases that mash-up results of multiple specialists. Out of those assigned in our database to anthropological sex, 55 % are male or possibly male, and 45 % female or possibly female. This imbalance of male/female estimations in favour of the former has been recently highlighted for published Iberian prehistoric collections (Cintas and Herrero 2020). When the possibility of comparing anthropological and genetic sex exists, as in Villalba-Mouco et al. (2021), the number of failed matches seem relatively low, six out of 93 identified cases. But out of those that failed, 5 anthropologically identified males were genetically female. Although a chi-squared test of this sample shows that the difference is not statistically significant, its pvalue (0.077) suggests that it may be bound to change when more data becomes available.
Estimating the age of death is also very much dependent on the criteria, knowledge, practice and skills of each individual analyst. Overall, adults have been the preferred choice for most researchers when sampling for 87 Sr/ 86 Sr analysis (64 %, Supplementary material Table 7). This is hardly surprising, as the probability of locational displacement increases with age. Nevertheless, 18 % of the sample are non-adults (<18 years old), while another 18 % are simply unknown. Keeping a stratified sampling strategy of age groups is the only way of recognizing the statistical but also cultural significance of pre-adult mobility. This is for example the case of the Copper Age site of Marroquíes, where non-local individuals, adult and non-adults, were buried without any formal distinction as part of the local community, thus suggesting inclusive social mechanisms among those living at the megasite (Díaz-Zorita Bonilla et al., 2018).
Most human samples have been obtained from dental enamel. A considerable 24 % are reported as 'molars' and 5 % are simply not determined. That is, over one fourth of the dental samples have not benefited from what are nowadays basic reporting standards. Nevertheless, in recent years authors have made an increased effort to report better and more detailed information of the analyzed samples.
An 11 % of the human sample is referred to as "bone" (see supplementary material Table 8). Unlike dental enamel, bone remodels through lifetime and does not necessarily reflect the local isotopic signature of childhood. Furthermore, the porous qualities of most human bones make them a better evidence of the soil Sr signature from what they were recovered than of the individual's possible mobility.

The distribution of the sample
When mapped, the analysed sample offers a variable and discontinuous distribution throughout Iberia (Fig. 4a). Regional concentrations are mostly associated to investigations produced by specific individuals or research groups. This is the case for areas such as the Lisbon peninsula (Waterman et al., 2014), the lower Guadiana basin (Valera et al. 2020), the upper Tagus and its northern tributaries (Díazdel-Río et al., 2017b) or the upper Ebro basin (Sarasketa-Gartzia et al. 2018). All of them have large enough sample size to allow for a reasonably robust approach to local/regional paleomobility (e.g. Fernández-Crespo et al., 2020b).
Except for northwestern Iberia, where soil acidity may sometimes affect bone preservation, the observed distribution is unrelated to the availability of human or faunal samples. In fact, there is a significant lack of analysed remains from the southeastern region, an area known for the quantity and preservation quality of its Copper and Bronze Age burials. This is most likely due to previous research priorities, such as the refinement of the chronology of Copper Age megalithic cemeteries (Aranda et  Overall, only two areas have data enough, albeit sometimes limited, to consider a trans-regional approach to paleomobility: a transect from the Lisbon peninsula through the lower Guadalquivir river valley in southwest Iberia, and a second and most complete of all running throughout the left margin of the river Ebro, in the northeast part of the peninsula. This patchy distribution is even clearer when mapping the sample by chronology, highlighting some obvious limitations when undertaking time-period specific research at an Iberian scale. Although the Copper Age has the highest number of analysed samples (764) (Fig. 4d), all the remaining periods have geographical distributions that preclude interregional approaches. This is the case for the Late Antiquity/Medieval periods (223 samples), the Neolithic (1 8 3) or the Bronze Age (1 6 5) (Fig. 4c, 4e, 4f). The Iron Age has to date the least analysed samples, all 60 of them from faunal collections. The lack of human samples (e.g. Torres-Martínez et al. 2021) is mostly a consequence of the generalization of cremation as a funerary rite and the limited preservation of teeth. Finally, the only 6 Mesolithic samples belong to the same individual, a young woman buried at the Aizpea rock shelter (Fernández Crespo et al., 2020b) and is thus anecdotal.
The described archaeological 87 Sr/ 86 Sr database has been used to elaborate a set of predictive isotope models, known as isoscapes, in a similar manner as the baselines produced for other regions (e.g. Scaffidi and Knudson 2020, Willmes et al. 2018, Frank et al. 2021). An isoscape consists of a continuous map of isotopic values (in this case, 87 Sr/ 86 Sr values) across a certain geographical area. This continuous surface is generated by means of interpolation, that is, "the procedure of predicting the value of attributes at unsampled sites from measurements made at point locations within the same area or region" (Burrough and McDonnell, 1998: 98).
We have followed with slight variations the geostatistical methodology suggested by Scaffidi and Knudson (2020) for the Andes, in particular the application of the so-called kriging modelling, performed with ArcGIS software (v. 10.4) and using as coordinate system UTM 30 North, ETRS89 Datum. Kriging is a procedure based on geostatistical interpolation rather than in other deterministic methods. The later (e.g. IDW or splines) calculates fixed predicted values through the application of pre-stablished algorithms, while the former is based on the generation of a predictive model obtained from the analysis of the data structure in itself. It should be highlighted that this method provides a measure of the accuracy of estimated values as confidence intervals (Scaffidi and Hudson, 2020: 4). Its practical importance lies in that it offers a statistical and spatially defined measure of the provisionality of results, and may serve as a guide for future research.
Isoscapes were produced for two different areas, applying similar geostatistical interpolation processes to both datasets. The first covers the whole Iberian Peninsula. The second approaches Central Spain, an area selected because of the density of data and its lithological variability. Needless to say, these isoscapes are inevitably provisional and tentative. This is due to the spatially biased nature of the 87 Sr/ 86 Sr values dataset, with significant gaps and clusters. The aim is not to provide a finished cartography, but to display the generation methodology, its potentialities and limitations.
As Scaffidi and Knudson (2020) suggest, our isoscapes are based on 87 Sr/ 86 Sr values obtained from samples recovered mostly from archaeological contexts. According to these authors, values taken from present landscape materials, known as baseline materials (plants, soil), frequently show drawbacks for provenance studies that focus on human/animal paleomobility. This is so because of the potential and not always evident cumulative effect of natural and especially human landscape transformations, most prominently agricultural practices. As the authors point out (Scaffidi and Knudson 2020: 2), actual landscape 87 Sr/ 86 Sr values can in fact differ significantly from both past landscapes and the underlying geological bedrock. Bodily tissue 87 Sr/ 86 Sr values are somehow an average of the isotopic catchment of the diet and, as such, reflect the geological and atmospheric contribution to a given location. That is, adequately filtered values recovered from archaeological contexts can be the most effective proxies of past local 87 Sr/ 86 Sr values.
Nevertheless, we have separately taken into account other 87 Sr/ 86 Sr environmental samples obtained in the context of archaeological research.
Although not yet generalized, some researchers have sampled plants, snails and occasionally water from the surroundings of archaeological sites and, sometimes, from more distant locations in order to obtain contemporary baselines. In some regions, such as northeaster Iberia, these samples cover voids lacking archaeological samples, and are helpful when broadly modeling archaeological signatures.
In order to be able to contrast independently archaeological and environmental signatures we have generated two different isoscapes for Iberia: one that strictly uses samples from archaeological contexts and a second that includes both archaeological and environmental samples.
Steps for modeling both isoscapes have been the following: • Definition of the dataset, selection of 87 Sr/ 86 Sr values located in the study area excluding all those not reported or with inaccurate coordinates (>1 km radius uncertainty). • Preliminary analysis of data sample distribution. Examination of the spatial pattern by means of Nearest Neighbor Analysis (NNA) 1 and evaluation of the presence/absence of spatial autocorrelation with Moran's I index. 2 NNA helps to determine if our data is homogeneously distributed across the study area, assessing its representativeness, and highlighting informational gaps. Spatial autocorrelation is a crucial concept when building interpolations, particularly in kriging. A set of values is said to be spatially autocorrelated when there is some kind of systematic relationship between them and the distance between their locations (Haining, 2001). Our assumption is that close-by 87 Sr/ 86 Sr values should have a positive spatial autocorrelation, that is, will tend to be more similar to each other.
• Identification and exclusion of spatial outliers. These values are exceptionally different from those in their surroundings, whether exceedingly high values surrounded by low values or vice versa. These samples cannot be considered representative of the local landscape and would produce distorting effects in the kriging interpolation. Anselin's Local Moran's index 3 has been applied to identify spatial outliers. • Production of the isoscape by means of kriging interpolation, performed with the Geostatistical Analyst tool ArcGIS, generating two maps that should be interpreted together. These represent: 1. the isoscape itself, that is, the interpolated surface with 87 Sr/ 86 Sr predicted values, and 2. the surface with the standard errors of predictions, a measure of their accuracy: the higher the standard error, the less accurate the prediction.

An isoscapes map for the Iberian Peninsula
In order to produce an isoscape for the Iberian Peninsula we have used all 1533 87 Sr/ 86 Sr values with available coordinates, mostly from archaeological contexts (n = 1369). Two kriging interpolations have been produced in order to contrast the weight of baseline values in the result, with and without environmental proxies.
In both cases, the Average Nearest Neighbor Analysis confirms the apparent clustered pattern observable in the map, 4 while Moran's I index shows a strong positive spatial autocorrelation. 5 Table 9a and 9b and prediction and standard error maps are shown in Fig. 5.
The resulting isoscape largely reflects the three main lithological regions of the Iberian Peninsula: the higher radiogenic granitic lithologies of the west, the Catalan Coastal Range and the Pyrenees, the predominantly limestone regions of the east with the lowest radiogenic values, and the clayey soils of the sedimentary basins showing intermediate values. The overall 87 Sr/ 86 Sr range for Iberia (0.7046-0.7207) reflects this variability and is similar to results obtained for neighbouring France (0.7046-0.7218) (Willmes et al. 2018).
We are obviously far from being able to produce a detailed base map for the Iberian Peninsula. This map can only be a work in progress. The Northern Meseta and most of the Atlantic façade, north from the Lisbon peninsula all the way to the border of the Basque Country, remain void of archaeological samples, while sample density is low even in the most intensely investigated areas such as central Iberia. Future data covering the existing voids will increase and define what to date are areas with a high standard error.
The incorporation of non-archaeological proxies, with all the cautions of the potential effects of agricultural systems in the isotopic values (crop history, manuring, etc), does allow for a more nuanced patterning, as the northern Ebro basin clearly shows. This combination of archaeological and environmental proxies can also allow for a geographical and lithological oriented sampling design that can cover areas that for different reasons (soil acidity, lack of research, etc) have no available archaeological samples. Such an approach should be based on a twophase strategy in order to be statistically robust, a detailed geostatistic sampling design followed by a first field approach to contrast ground truth.
Nevertheless, some clear spatial patterns emerge. For example, 87 Sr/ 86 Sr values higher than 0.712 that concentrate mostly in the southwestern regions will be probably generalized to most of the Iberian Massif (see Fig. 5). These more radiogenic siliceous lithologies (granite, gneiss, slate, quartzite, schists or sandstones) cover much of the western half of the peninsula and some of the main mountain chains, such as the Pyrenees or the Central mountain range. Individuals within those values located in non-radiogenic regions are thus expected to be clearly recognizable. Inversely, the same can be said for 87 Sr/ 86 Sr values lower than 0.7085, clearly associated with calcareous regions, and mean values of 0.709-0.710 related to the Cenozoic basin.

An isoscape map for Central Iberia
We have chosen an area of 13.000 km 2 in Central Iberia, roughly the region of Madrid, in order to observe the effect of scale in the resulting isoscapes. The region combines two significant characteristics: it is one of the most densely sampled areas in the peninsula, and includes the complete range of 87 Sr/ 86 Sr values documented for the whole of Iberia (0.7078-0.7205). This makes the region an excellent laboratory to determine the correlation between 87 Sr/ 86 Sr values and specific lithologies.
The database for the 13.000 km 2 isoscape for Central Iberia has 149 available samples for 23 locations, all from archaeological contexts with known values and precise coordinates. Although not as patent as the results for all Iberia, the Average Nearest Neighbor Analysis 6 confirms the existence of a clear clustered pattern, while Moran's I index shows a strong positive spatial autocorrelation. 7 We have identified 2 outliers, both higher values than those in their surroundings. The resulting interpolation and standard error maps are thus based on 147 87 Sr/ 86 Sr values. Kriging parameters and results are provided in Table 10.
The isoscape (Fig. 6) realistically reflects the lithological units of the region and their gradation, from northeast radiogenic granitic and metamorphic rocks of the Central System to the least radiogenic Tertiary limestones of the southeast. The transitional zone, in fact the most sampled of all, shows the characteristic intermediate values typical of substrata lithologies such as arkoses, clays and gypsum (0.709-0.710). Finally, two locations with low isotopic signatures stand out to the northeast, accurately representing the strip of Cretaceous limestone in which the two sampled caves are located, El Rebollosillo (Díaz-del-Río et al., 2017a) and El Destete (Jiménez and Alcolea 2001).
Along with the utility for those of us working on population paleomobility in this region (e.g. Díaz-del-Río et al., 2017b), this isoscape demonstrates the effectiveness of selective sampling in areas with relatively low or clear-cut lithological variability. Nevertheless, it is also an extreme example of the underlying problem of equifinality. The selected 13.000 km 2 of Central Iberia cover the complete range of 87 Sr/ 86 Sr values for all the Iberian Peninsula. As the principles of parsimony should prevail when interpreting 87 Sr/ 86 Sr paleomobility results, any potential non-regional human or animal will inevitably be considered of regional origin, highlighting the need for a multiproxy approach that includes 87 Sr/ 86 Sr, δ 34 S, δ 18 O and δ 13 C (Holt et al. 2021:12).

Conclusions
We have reviewed 12 years of strontium research on paleomobility in Iberia. Although a very young, collective and uncoordinated research program, its cumulative effect is already offering reasonably good possibilities for the study of regional and inter-regional past human and animal mobility patterns.
The quality of sample description has substantially improved throughout these 12 years. Precise locations, contexts and sample details are mostly well reported nowadays. Although most of the projects have random, non-contextually nor geostatistically oriented sampling strategies, they have had an overall positive cumulative effect. Two strategies for research may improve this outcome. The first involves broadening the geographical limits of the actual sample towards uncovered areas of Iberia, a strategy that may benefit from a systematic sampling of environmental proxies in those areas with limited available archaeological samples (e.g. James et al. 2022). Nevertheless, further regional focused research would be required, as the already available environmental and archaeological results are not directly comparable because of the different nature and size of samples. The second entails expanding the temporal and increasing the contextual quality of samples in regions that already have a reasonably good data set. Certainly, a more nuanced, contextual and problem-oriented approach will increase the interpretative potential of the data set. Both strategies will have a direct effect on the quality and accuracy of future isoscapes. In the meantime, archaeologists should be conscious of these limitations when building their cultural interpretations.
The possibility of building a bottom-up collaboration, through the already existing open-access online platforms or through any other, is an advantage that was not available to Iberian researchers during the emergence of many other data-sets. This was the case for the already substantial radiocarbon chronology database (Idearq n.d.) or the results of long-term archaeometallurgical research programs (Rovira and Montero 2018). This collective effort is achievable given that there are

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
All data used is attached and available on line at www.idearq.org