Data analysis and Geostatistics
2025
Exercises
Lab 1: Data description of a univariate temperature time series
Exercise_1 Excel file with data and questions
Lab 2: Assessing data quality and levelling datasets
Exercise_2 Excel file with data and questions
Overview of the dumps of a Ni-PGE mine in Botswana.
Copyright: Vincent van Hinsberg & Simon Vriend
Last updated: September 2025
Labs - applying statistical tools to geological data
The best approach to learning (geo)statistical tools, interpreting their results, and identifying how and where these tools can aid in understanding data is to work with real-world geo-data. We will do this in two ways in the lab component of this course. First, there will be a set of exercises to help you become familiar with calculating statistical properties in spreadsheet programs and the PAST statistics software, and to highlight certain statistical methods, approaches and properties. Second, you will work towards progressively understanding a large dataset of litho-geochemical samples from BC using the various statistical tools and approaches that have been discussed in the lectures. The dataset contains geological information, element concentrations and field observations, which will have to be explored in combination.
The datasets are original, unmodified data as provided by a variety of laboratories and should therefore be thoroughly checked before you start your analysis and interpretation. All datasets contain a wealth of statistically interesting features and it is impossible to discover them all. That is not the point of the lab and I will not grade your report on whether or not you found everything and tried every technique. The purpose is to dissect and understand the dataset so that you are able to interpret the data in a geological and geochemical context, and your reports will be graded on the level of insight into these data. There are many ways to dissect a dataset and there are generally a variety of statistical techniques that will lead you to the same conclusion. So feel free to attack this dataset in whatever way you like, but the following statistical tools should at least be included:
• data description (e.g. mean, median, mode, IQR)
• scatter diagrams, box-and-whiskers plots, histograms
• tests of distribution, cumulative frequency diagrams
• correlation tests and correlation matrices
• t-tests, F-tests or their rank-equivalents
• analysis of variance
• cluster and/or discriminant function analysis
• principal component and/or factor analysis
• (multiple) regression analysis
• spatial analysis of the data, maps, semivariograms
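As an illustration of the first item in the list above (data description), here is a minimal Python sketch using only the standard library. It is not part of the course material, which works with spreadsheet programs and PAST. The function name `describe` and the sample values are made up for illustration, and the quartiles use the "inclusive" interpolation method, which may differ slightly from the defaults in other software.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration; not taken from the course.
    """
    # Quartiles by linear interpolation, treating the data as the
    # full population range ("inclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # list, since ties are possible
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Made-up values for illustration only (not from the course dataset).
sample = [12, 15, 15, 18, 21, 24, 90]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the single high value pulls the mean well above the median, while the median and IQR are barely affected. That is one reason to report both sets of statistics when checking a dataset for outliers.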
You are strongly encouraged to work on the exercises and project together and are free to work on these during the labs or at any time convenient to you. The instructor is available during lab times to discuss the exercises and to answer any questions regarding the project. A final report on the data analysis project is to be handed in at the end of the course and counts for 40% of the final grade. This final report should be approximately 10 pages in length, excluding tables and figures (these are recommended to be put into appendices). One report is to be submitted per group of 2-3. The exercises are not marked.