Tuesday, November 8, 2011

How criminology and computational statistics can help archaeology...


I've been working for a while (Crema et al. 2010, Crema in press) on the issue of temporal uncertainty in archaeological analysis. My interest in it emerged while I was trying to do the simplest spatial analysis of the distribution of pithouse dwellings. While building my database I found something like the following:

ID, Name, Date
001, Pithouse A, Kasori E1 phase
002, Pithouse B, Middle Jomon
003, Pithouse C, transition between Kasori E2 and E3 phase
....

Now... in case you're not an expert on the pottery phases of the Jomon period (you should be, they look good), the Middle Jomon period lasted ca. 1,000 years (ca. 5470-4420 cal BP) and the Kasori E1 phase ca. 60 years (4900-4840 cal BP), with the latter being a sub-phase of the former.

You can easily imagine my problem here. In order to do any diachronic analysis I have two options. I could lump a large portion of my data together at the coarsest resolution I have (Middle Jomon in this case, virtually dismissing what I know about pithouse A) and then carry on with my analysis. Or I could use only those records that satisfy the temporal resolution I am interested in and ignore the rest of the data (thus removing pithouse B in this case and using only A and C).
Both solutions are highly unsatisfactory, and frankly the second one even involves omitting part of the available knowledge....

While looking around, I found some papers by Jerry Ratcliffe, an American criminologist who happened to have very similar problems... Imagine you've left your car at 9 AM, you've worked all day, and when you finish at 5 PM you find that your car has been stolen. You know that the crime happened sometime between 9 AM and 5 PM. Now if you are a criminologist trying to analyse the data of other thefts, you will quickly notice that the great majority of the temporal data consists of intervals within which the crime occurred rather than the precise time of the event. Ratcliffe calls these intervals time-spans and noticed that you might have shorter ones when you have more information (a friend came late to work and noticed the car was already missing at 11 AM) and longer ones when you have less. The problem of spatio-temporal analysis of crime data is that the time-spans in your dataset consistently differ in length.... Exactly the same problem WE have...

The solution proposed by Ratcliffe is called aoristic analysis (Ratcliffe and McCullagh 1998) and essentially relies on what is called the "principle of insufficient reason". It assumes that, other things being equal, if we divide our time-span into equally long time-blocks (e.g. decades or hours), the chance that the event occurred is the same for each of the blocks within the time-span. In other words, if you don't have any further information, the chance that your car was stolen between 9 and 10 AM is equal to the chance that it was stolen between 3 and 4 PM. Based on this very simple premise, we can assign probabilistic measures to our "events".
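To make the idea concrete, here is a minimal sketch in Python (my own illustration, not code from any of the papers cited here). The function name `aoristic_weights` and the choice of hourly blocks are assumptions made for the example. It gives each time-block the share of the time-span that falls inside it, which is all the principle of insufficient reason asks for.

```python
def aoristic_weights(start, end, block_edges):
    """Probability that an event with time-span [start, end) falls in each block.

    block_edges is an ascending list of block boundaries; block i runs from
    block_edges[i] to block_edges[i + 1]. Assumes the time-span lies inside
    the range covered by the blocks and that the event is equally likely
    at any moment of its time-span.
    """
    span = end - start
    weights = []
    for lo, hi in zip(block_edges[:-1], block_edges[1:]):
        # Length of the part of the time-span that lies inside this block.
        overlap = max(0, min(end, hi) - max(start, lo))
        weights.append(overlap / span)
    return weights


# Hourly blocks over one day: block 9 is 9-10 AM, block 15 is 3-4 PM.
hours = list(range(0, 25))

# Car parked at 9 AM, found missing at 5 PM: eight equally likely hours.
car = aoristic_weights(9, 17, hours)

# A friend saw the empty parking space at 11 AM: the time-span shrinks,
# so each remaining hour carries more weight.
car_with_witness = aoristic_weights(9, 11, hours)
```

With the eight-hour time-span every hour between 9 AM and 5 PM gets a weight of 1/8, and with the witness the two remaining hours get 1/2 each. The same function works for pottery phases if the blocks are, say, decades or centuries instead of hours.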

Now the problem I was facing is that probability weights cannot be used in standard analyses. You can perhaps enhance the visualisation of the data, and maybe provide a broad cumulative sum of the probabilities as a time-series. But if you want to do something more sophisticated you need non-probabilistic data, since the majority of available tools are not designed to deal with temporal uncertainty.

Then I came across Monte Carlo simulation. The idea itself is very simple. Based on a probability distribution (in this case given by the aoristic analysis) one could, in principle, simulate all the possible combinations of events, and hence all the possible spatio-temporal patterns that might have occurred. The number of these is immense, but if a sufficient degree of knowledge is available, some patterns will occur more frequently than others. Hence by simulating n scenarios, one can compute the proportion of them in which a given pattern is observed. This proportion then provides an estimate of how likely that pattern is.
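A minimal sketch of this procedure in Python, again my own illustration rather than the implementation used in the papers. It assumes, for simplicity, that each pithouse can be treated as a single dated event that is equally likely anywhere in its time-span; the function names and the example pattern ("pithouse B is older than pithouse A") are hypothetical. The two time-spans are the phase ranges quoted earlier in this post.

```python
import random

# Time-spans in cal BP as (oldest, youngest); larger numbers are older.
spans = {
    "Pithouse A": (4900, 4840),  # Kasori E1 phase
    "Pithouse B": (5470, 4420),  # Middle Jomon
}


def simulate_scenario(spans, rng):
    """Draw one possible date per pithouse, uniform within its time-span."""
    return {name: rng.uniform(youngest, oldest)
            for name, (oldest, youngest) in spans.items()}


def pattern_frequency(spans, pattern, n=10_000, seed=1):
    """Proportion of n simulated scenarios in which pattern(dates) is True."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(pattern(simulate_scenario(spans, rng)) for _ in range(n))
    return hits / n


def b_older_than_a(dates):
    """Example pattern: pithouse B dates earlier (larger cal BP) than A."""
    return dates["Pithouse B"] > dates["Pithouse A"]


p = pattern_frequency(spans, b_older_than_a)
```

For this toy case the answer can also be worked out by hand: under the uniform assumption it is (5470 - 4870) / 1050, roughly 0.57, and the simulated proportion `p` should land close to that. The simulation becomes worthwhile once the pattern is something like a spatial statistic computed over hundreds of pithouses, where no such shortcut exists.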

Adopting Monte Carlo simulation opens up an entire array of possibilities. One could in fact use different sources of knowledge, from radiocarbon dates to stratigraphic relations, and explore the range of possible spatio-temporal patterns. One then simply has to assess each of the simulated scenarios and compare the distribution of the outcomes to make inferences about the past in probabilistic terms...

References


Crema, E. R., Bevan, A. and Lake, M., 2010, A probabilistic framework for assessing spatio-temporal point patterns in the archaeological record, Journal of Archaeological Science, 37, 1118-1130.

Crema, E. R., In press. Aoristic Approaches and Voxel Models for Spatial Analysis. In: Jerem, E., Redő, F. and Szeverényi, V. (eds.) On the Road to Reconstructing the Past. Proceedings of the 36th Annual Conference on Computer Applications and Quantitative Methods in Archaeology. Budapest: Archeolingua.

Crema, E. R., In press, Modelling Temporal Uncertainty in Archaeological Analysis, Journal of Archaeological Method and Theory (online first). 

Johnson, I., 2004. Aoristic Analysis: seeds of a new approach to mapping archaeological distributions through time. In: Ausserer, K. F., Börner, W., Goriany, M. and Karlhuber-Vöckl, L. (eds.) [Enter the Past] the E-way into the Four Dimensions of Cultural Heritage: CAA2003. BAR International Series 1227. Oxford: Archaeopress, 448-452.

Ratcliffe, J. H. and McCullagh, M. J., 1998, Aoristic crime analysis, International Journal of Geographical Information Science, 12, 751-764.

Ratcliffe, J. H., 2000, Aoristic analysis: the spatial interpretation of unspecified temporal events, International Journal of Geographical Information Science, 14, 669-679.
