Data analysis and Geostatistics

2024

 
 

Statistical techniques provide powerful tools for analyzing and interpreting data, and in this short course you will become familiar with the most commonly used techniques to analyze data in the geosciences. Starting with the most basic statistical parameters, we will gradually move to more complicated multivariate techniques, including cluster analysis, factor analysis and (multiple) regression.


The short course consists of lectures that discuss concepts, theory and tools, and practical lab sessions in which the tools introduced in the lectures are applied to geological datasets. Many statistical analyses and graphs can be prepared in typical spreadsheet programs, but for more advanced analyses we will use the PAST statistics package. It is assumed that you are thoroughly familiar with analyzing data in spreadsheet programs such as Excel, Calc or QuattroPro, including setting up formulae and functions. If you need a refresher, most spreadsheet programs offer help files, both offline and online (for example: Excel and Calc). Although PAST is very powerful, it does not do everything. Alternatives include NCSS, which has a 30-day trial version, and the free statistical routines developed for the Fortran, R and Python programming languages.


Details of the lab can be found here, and lectures will be posted here after each session. The course will be taught in hybrid format, with in-person lectures and labs in the Frank Dawson Adams building of McGill, simultaneously streamed on Zoom for remote participants. Please see your email for Zoom details. Room numbers can be found here.


The short course will mainly focus on the fields of data analysis and statistical testing and modelling. Some aspects of probability analysis will be addressed as well, mainly in relation to confidence intervals and the concept of “statistical proof”. Of further interest is the issue of impartiality of the observer in geological studies as exemplified in the 3-door problem.
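
As a small illustration of why intuition about probabilities can mislead an observer, the sketch below enumerates every outcome of the 3-door problem. It assumes that "3-door problem" refers to what is commonly called the Monty Hall problem: a prize sits behind one of three doors, the player picks a door, and a host who knows where the prize is opens one of the other, empty doors. The function name and the door numbering are hypothetical choices made for this example only and are not taken from the course material.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)


def win_probabilities():
    """Return (P(win if staying), P(win if switching)) by exact enumeration.

    Assumption: the prize door and the first pick are each uniform over the
    three doors, and the host opens, at random, a door that is neither the
    player's pick nor the prize door.
    """
    stay = Fraction(0)
    switch = Fraction(0)
    cases = list(product(DOORS, DOORS))  # all (prize, pick) combinations
    for prize, pick in cases:
        # Doors the host is allowed to open in this case.
        host_options = [d for d in DOORS if d != pick and d != prize]
        for opened in host_options:
            # Probability of this exact (prize, pick, opened) scenario.
            weight = Fraction(1, len(cases) * len(host_options))
            # The only closed door left besides the player's pick.
            switched = next(d for d in DOORS if d not in (pick, opened))
            stay += weight * (pick == prize)
            switch += weight * (switched == prize)
    return stay, switch


p_stay, p_switch = win_probabilities()
print("stay:", p_stay, "switch:", p_switch)
```

Under these assumptions staying wins one time in three and switching wins two times in three, even though the two remaining closed doors may look equally likely.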


An overview of the course and its schedule can be found here.


The book that I recommend for the course is “An introduction to geological data analysis” by Swan and Sandilands (Blackwell Publishing, ISBN 0632032243). The main advantage of this book over more generic statistics textbooks is that it is tailored to the statistical techniques relevant to the geosciences. Unfortunately, the book is no longer in print, but second-hand copies are readily available online. The course covers the following chapters:


    chapter 1:    completely

    chapter 2:    2.1, 2.2, 2.4 (except 2.4.5.3/4), 2.5, 2.6 up to 2.6.2.3

    chapter 3:    up to 3.3.2

    chapter 4:    completely

    chapter 7:    7.1 & 7.2 (general concepts only), 7.3, 7.4.3

    chapter 8:    8.1.1, 8.1.2, 8.3, 8.4, 8.5, 8.6 (general concepts only)


There are also many excellent online resources on statistics and data analysis, including a full textbook by the creators of the NCSS software.

Data analysis & Geostatistics: Geotop Short Course

on the use of statistical techniques in the Earth Sciences

Course schedule

- lectures: 9:15-11:00 and 11:15-13:00

- labs: 14:00-17:00

Clockwise from top: the Kawah Ijen crater lake in Java, Indonesia; geological map of the Desges valley, French Massif Central; tourmaline-biotite thermometer with uncertainty in formulation and data.


Copyright:     Vincent van Hinsberg & Simon Vriend


Last updated:     March 2024

Examination

- formal written exam: 50% of final grade

- data analysis project: 50% of final grade, in groups of 3; consists of analyzing a large geological data set using a variety of statistical methods

Course prerequisites

There are no formal prerequisites for this course, but a thorough knowledge of spreadsheet programs and their (statistical) functions is assumed.