Methods

Measurements and Data

Database: Cyperus papyrus.dbf, and associated files.

Spread sheet: Cyperus papyrus Measurements.xls

Spread sheet: Cyperus papyrus Analysis.xls

Herbarium Specimens

Cyperus papyrus herbarium sheets were selected for analysis according to the criteria described in the spread sheet Cyperus papyrus Measurements:Sheets.  The label information was transcribed into a BRAHMS 7.20 Rapid Data Entry database, Cyperus papyrus.dbf, resulting in 36 records.  All sheets were given a unique accession number in BRAHMS, which is used consistently throughout the data analysis and is referred to as the sheet number in the measurement and analysis results.

Characters & Measurements

Characters were selected to complement but not replicate previous monographs.  Factors influencing character choice included expediency, there being limited opportunity to remove and dissect material from herbarium specimens, and microscopic examination in situ being extremely time-consuming.  Characters that might be usable in the field to segregate taxa, without resorting to microscopic examination, were also preferred.

In all, 39 characters were measured or observed.  They are listed in full in Cyperus papyrus Measurements:Legend & Chars, and the raw measurements are recorded in Cyperus papyrus:Raw Scores.  For each character the list contains: character notation and brief description, units of measurement, repetitions per sheet where appropriate, and permissible states for categorical character types.  In addition, a brief description of each sheet, for example whether it contained a whole or partial inflorescence, was recorded in Cyperus papyrus Measurements:Sheets.

Following sampling of approximately 10 sheets, chosen from a wide geographic range and showing morphological variation, the character list was revised.  A number of characters were dropped because they did not appear to differentiate sheets, while others observed during the sampling were added.  The additional characters were re-scored for the sample sheets.  Dropped characters are retained in Cyperus papyrus:Raw Scores.  Some characters were measured using a binocular microscope; calibration measurements are recorded in the same worksheet.

The structure of the C. papyrus inflorescence is complex.  For ease of recording and reference, a particular notation is therefore applied throughout, following one applied to C. esculentus by Holm, Studies in the Cyperaceae XXXV, in Amer. Journ. of Sc. (1904): 302 [not seen], as cited in Kükenthal, 1935: 13.  It is described and illustrated in the Characters tab.

Notes about issues relating to the characters and their scoring were kept as the work progressed and are recorded in Cyperus papyrus:Raw Scoring Notes, along with further notes about individual sheets.

Locations

In most cases latitudes and longitudes were absent from the labels.  In such cases they were inferred using the online resource provided by Geonames (Wick & Vatant, 2013) and recorded in BRAHMS along with an indication of the degree of confidence, and therefore specificity, of the inference (field “InfLL”); in some cases it was not possible to infer below country level from the information available.  Where coordinates are thus inferred, the maps reflect whether the inference is to locality or country level.  The inference codes used are: 0 – no inference, coordinates on label; 1 – location; 2 – area within a country, e.g. White Nile area of Sudan; 3 – country only.
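The inference codes lend themselves to a small lookup. The following is a minimal sketch only, assuming the four codes exactly as listed above; the class, function and record values are hypothetical and are not part of the BRAHMS database.

```python
from enum import IntEnum


class InfLL(IntEnum):
    """Coordinate inference codes as listed in the text (field InfLL)."""
    ON_LABEL = 0   # no inference, coordinates on label
    LOCATION = 1   # inferred to location
    AREA = 2       # inferred to an area within a country
    COUNTRY = 3    # inferred to country only


def is_inferred(code: int) -> bool:
    """True when coordinates were inferred rather than read from the label."""
    return InfLL(code) is not InfLL.ON_LABEL


# Hypothetical records: (sheet number, InfLL code). Values are invented.
records = [(1, 0), (2, 1), (3, 3)]
inferred_sheets = [sheet for sheet, code in records if is_inferred(code)]
```

A mapping routine could use the same codes to choose a different symbol for locality-level and country-level points.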

Data Analysis

Raw scores were processed to create data suitable for further analysis using statistical packages: repetitions of raw measurements were averaged, some ratios were calculated where appropriate (for example spikelet length / width), and missing values were replaced with 0.  The results are recorded in Cyperus papyrus:Proc Scores.  A second stage eliminated spread sheet functions and rounded the scores; the results are recorded in Cyperus papyrus:Stats Scores.  For ease of processing, a new spread sheet, Cyperus papyrus Analysis, was created using the data from this worksheet.
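The processing steps described above (averaging repetitions, deriving a ratio, replacing missing values with 0, rounding) can be sketched as follows. This is an illustration only: the work itself was done in spread sheets, and the function name, character names, rounding precision and values below are all assumptions.

```python
from statistics import mean


def process_sheet(raw):
    """Average repeated measurements, derive a ratio, and fill gaps with 0.

    `raw` maps a character name to a list of repeated measurements;
    an empty list stands for a missing value.
    """
    processed = {}
    for character, repeats in raw.items():
        # Missing values are replaced with 0, as described in the text.
        processed[character] = round(mean(repeats), 2) if repeats else 0
    # Example ratio: spikelet length / width (set to 0 if width is missing).
    length = processed.get("spikelet_length", 0)
    width = processed.get("spikelet_width", 0)
    processed["spikelet_l_w"] = round(length / width, 2) if width else 0
    return processed


# Hypothetical raw scores for one sheet (names and values are invented).
raw_scores = {
    "spikelet_length": [10.0, 12.0, 11.0],
    "spikelet_width": [1.0, 1.2, 0.8],
    "bract_length": [],
}
result = process_sheet(raw_scores)
```

Note that coding a missing value as 0 makes it indistinguishable from a true measurement of 0 in later analyses, which is one reason characters with many missing values may be better dropped.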

A small number of characters, listed in Cyperus papyrus Analysis:Legend and in Table RD1, were dropped from the analysis at this stage, either because data were absent from a significant number of specimens (see Characters tab for notation) or because they were considered unhelpful, e.g. anther connective shape.

Three worksheets containing data were created in Cyperus papyrus Analysis, as detailed in Cyperus papyrus Analysis:Legend: Cyperus papyrus Analysis:Scores All – all characters; Cyperus papyrus Analysis:Sel Quant – quantitative characters only, from those remaining after selection (see above); and Cyperus papyrus Analysis:Sel All – all types of characters, from those remaining after selection.  A small number of categorical characters (e.g. B2 number) were not deleted from the Sel Quant quantitative data set, having previously been converted, where necessary, to numerical values.  It was found by investigation that their inclusion did not significantly affect the outcomes of the statistical analysis.  Spikelet shape was treated as a categorical ordinal variable, as the variation is continuous from ovate / lanceolate to cylindrical.

The column ‘Clust6’ was added to the worksheets following Cluster analysis (see below).

Principal Component Analysis (PCA) & Cluster Analysis (CA)

Minitab 16.2.3 was used for multivariate PCA and Cluster Analyses.  Data were loaded from Cyperus papyrus Analysis.  PCA was undertaken using worksheet Sel Quant with default settings, including all characters in the data set except Clust6 (see above and below) and computing all axes.  Cluster analysis was undertaken with variables standardised and distance measures displayed, using the Ward linkage method with a Euclidean distance measure, as recommended by the Minitab Multivariate Analysis Manual 2003; this was the most effective of the algorithms available in Minitab from among those recommended by Gelbard et al., 2007 for biological data.  The CA was repeated several times to identify even clusters of data, and the results of a six-group clustering were ultimately written out to the new column Clust6.
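For readers unfamiliar with the method, Ward linkage on standardised variables can be sketched in plain Python as below. This is not a reproduction of Minitab's implementation: it uses the textbook Lance-Williams update on squared Euclidean distances, and the function names and data are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def standardise(rows):
    """Convert each column to z-scores (assumes no column is constant)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def ward_clusters(rows, k):
    """Agglomerate rows with Ward's criterion until k clusters remain."""
    data = standardise(rows)
    clusters = {i: [i] for i in range(len(data))}
    # Squared Euclidean distances between the initial singleton clusters.
    dist = {
        (i, j): sum((a - b) ** 2 for a, b in zip(data[i], data[j]))
        for i, j in combinations(range(len(data)), 2)
    }

    def d(a, b):
        return dist[(a, b) if a < b else (b, a)]

    next_id = len(data)
    while len(clusters) > k:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        ni, nj = len(clusters[i]), len(clusters[j])
        updated = {}
        for m in clusters:
            if m in (i, j):
                continue
            nm = len(clusters[m])
            # Lance-Williams update for Ward linkage.
            updated[m] = ((ni + nm) * d(m, i) + (nj + nm) * d(m, j)
                          - nm * d(i, j)) / (ni + nj + nm)
        members = clusters.pop(i) + clusters.pop(j)
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        for m, v in updated.items():
            dist[(m, next_id)] = v  # next_id exceeds every existing id
        clusters[next_id] = members
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical scores: six sheets, two characters (values are invented).
scores = [[1.0, 10.0], [1.2, 11.0], [5.0, 50.0],
          [5.2, 52.0], [9.0, 90.0], [9.1, 95.0]]
groups = ward_clusters(scores, 3)
```

Repeating the call with different values of `k` mirrors the repeated runs described above, in which the number of groups was varied until the clusters were of even size.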

Principal Coordinate Analysis (PCO) & Principal Component Analysis (PCA)

Principal Coordinate Analysis and Principal Component Analysis were undertaken with MVSP 3.22 using the Gower General Similarity coefficient, which processes qualitative as well as quantitative data.
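As an illustration of how a Gower-type coefficient combines the two kinds of data, the sketch below scores quantitative characters as 1 minus the range-scaled difference and categorical characters as a simple match, then averages over the characters compared. It is a textbook form of the coefficient, not MVSP's code; the function name, the use of None for missing values and the example values are assumptions.

```python
def gower_similarity(a, b, kinds, ranges):
    """Gower's general similarity between two specimens.

    kinds[i] is "quant" or "cat"; ranges[i] is the observed range of a
    quantitative character across all specimens (ignored for "cat").
    None marks a missing value and drops that character from the comparison.
    """
    total, compared = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue
        if kind == "quant":
            # Range-scaled difference; a zero range means all values agree.
            score = (1.0 - abs(x - y) / rng) if rng else 1.0
        else:
            score = 1.0 if x == y else 0.0
        total += score
        compared += 1
    return total / compared if compared else float("nan")


# Hypothetical characters: two quantitative, one categorical.
kinds = ["quant", "quant", "cat"]
ranges = [10.0, 4.0, None]
sheet_a = [12.0, 3.0, "ovate"]
sheet_b = [17.0, 3.0, "cylindrical"]
similarity = gower_similarity(sheet_a, sheet_b, kinds, ranges)
```

Here the three character scores are 0.5, 1.0 and 0.0, giving an overall similarity of 0.5.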

Cyperus; Cyperus papyrus
