Forest Type Suitability

European forest formations are modelled in terms of the habitat suitability of each forest group. In this approach, each field plot is assigned to a species assemblage group. The grouping follows the classification of European forest types (EEA, 2006) at the Forest Category level of assemblage (see Casalegno et al., 2011).

Forest formations can be modelled at different levels of plant aggregation, from species to formations and biomes. Modelling at the forest formation level or higher has several advantages: faster processing, detection of shared patterns in environment/species relationships, and an enhanced capacity to synthesise complex data into information that scientists and decision makers can interpret more readily (Ferrier and Guisan, 2006). In addition to the general assumptions of species habitat suitability modelling, the forest formation approach assumes that formations consist of species with similar distributions defined by environmental gradients, as explained by the continuum concept (Austin and Smith, 1989). Because of these assumptions and the grouping scheme adopted, the analysis is restricted to the categories that best agree with this assumption. Therefore, 10 of the 14 forest categories of the European forest types classification are selected and modelled using the Random Forest (RF) classifier. The output map of forest category suitability shows, for each grid cell, the category that best fits the corresponding environment.

METHOD
1. Data Preparation
Several datasets are prepared for this approach: an input response/predictor table for model building and fitting, and a set of environmental predictor maps for suitability mapping.

•Bioclimatic and Environmental data
The environmental predictor maps were built from 47 different inputs at a grid cell size of 1 km. The predictors comprise two soil variables selected from the European Soil Database, six geomorphological factors derived from the SRTM digital elevation model, and 39 bioclimatic factors and indices computed from the monthly averaged minimum and maximum temperatures and the monthly precipitation of the WORLDCLIM database.

•Field data
Response variables are derived from empirical data collected in the field and integrated in the Forest Focus database. The Level I and Level II databases, updated for the year 2004, were merged. Each field plot is classified according to the classification of European forest types (EEA, 2006) (Table 1).

Table 1 - Forest categories modelled.

Due to their relatively low frequency in the Forest Focus database, the following forest categories were excluded from the analysis: Mire and swamp forests; Floodplain forest; Non-riverine alder, birch or aspen forest; and Plantations and self-sown exotic forests. Figure 1 shows the distribution of the input field samples classified according to their forest category.


Figure 1 - Distribution of the input field data classified according to the European forest types categories.

•Input response / predictor table
The input response/predictor table is built by extracting, at each field plot location, the values of the environmental factors and the forest category observed there. The final table includes one row per plot location, one column for the forest category (one of the 10 categories) and one column for each of the 47 environmental predictor variables.
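The extraction step can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure actually used: the rasters are taken to be north-up grids held as nested lists, and the helper names (`cell_index`, `build_table`), the predictor names and all values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Grid = List[List[float]]  # row-major raster; row 0 is the northernmost row


def cell_index(x: float, y: float, x_min: float, y_max: float,
               cell_size: float) -> Tuple[int, int]:
    """Return (row, col) of the grid cell containing the point (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def build_table(plots: Sequence[Tuple[float, float, int]],
                predictors: Dict[str, Grid],
                x_min: float, y_max: float,
                cell_size: float) -> List[dict]:
    """Build one record per plot: forest category plus one value per predictor.

    Assumes every plot falls inside the grid (no bounds check in this sketch).
    """
    table = []
    for x, y, category in plots:
        row, col = cell_index(x, y, x_min, y_max, cell_size)
        record = {"category": category}
        for name, grid in predictors.items():
            record[name] = grid[row][col]
        table.append(record)
    return table


# Toy data: two 2x2 predictor grids with 1 km (1000 m) cells, hypothetical values.
predictors = {
    "elevation": [[100.0, 200.0], [300.0, 400.0]],
    "annual_precip": [[600.0, 650.0], [700.0, 750.0]],
}
# Each plot is (x, y, forest category code between 1 and 10).
plots = [(500.0, 1500.0, 3), (1500.0, 500.0, 7)]

table = build_table(plots, predictors, x_min=0.0, y_max=2000.0, cell_size=1000.0)
```

With the real data, the same loop would run over all 47 predictor grids, giving one column per predictor in addition to the category column.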

2. Model fitting and validation
The RF classifier is tuned through two parameters: the number of variables randomly sampled at each split and the number of classification trees in the ensemble. Several parametrisation tests were run to find the best combination. For each classification tree of the ensemble, RF automatically creates an in-bag dataset for model calibration and an out-of-bag dataset for validation. This provides a reliable error estimate based on data that are randomly withheld from the development of each tree. An external validation dataset is therefore not needed (Breiman, 2001; Lawrence et al., 2006).
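The in-bag/out-of-bag split can be illustrated with a short sketch. It shows only the sampling step (a bootstrap sample drawn with replacement for each tree), not tree growing; the function name `bootstrap_split`, the number of plots and the seed are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def bootstrap_split(n_rows: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n_rows row indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(in_bag)
    # Rows never drawn for this tree form its out-of-bag set.
    out_of_bag = [i for i in range(n_rows) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_rows = 5000           # hypothetical number of field plots
n_trees = 20            # kept small here for speed

oob_fractions = []
for _ in range(n_trees):
    in_bag, out_of_bag = bootstrap_split(n_rows, rng)
    oob_fractions.append(len(out_of_bag) / n_rows)

# Average share of plots withheld from a single tree.
mean_oob_fraction = sum(oob_fractions) / n_trees
```

For a large table, roughly 1/e (about 37%) of the plots are out-of-bag for any given tree, so every plot is withheld from many trees and can be used to test them.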

The RF classifier reports model performance as an error rate for the overall model and as a per-class error rate. The following ranges, expressed as proportions (0 to 1), describe the level of model performance according to the out-of-bag estimate of the error rate:

•Poor: 0.45 < error rate < 1.00
•Fair: 0.30 < error rate < 0.45
•Moderate: 0.20 < error rate < 0.30
•Good: 0.10 < error rate < 0.20
•Very good: error rate < 0.10
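As a hedged illustration of how these figures relate, the sketch below derives the overall and per-class error rates from a confusion matrix and maps an error rate to the levels listed above. The function names and the toy matrix are hypothetical. The ranges above leave values exactly on a boundary undefined, so assigning them to the better level is an assumption of this sketch.

```python
from typing import List, Tuple


def error_rates(confusion: List[List[int]]) -> Tuple[float, List[float]]:
    """Overall and per-class error from a square confusion matrix.

    Rows are observed classes, columns are predicted classes.
    Assumes every class was observed at least once (no empty rows).
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    overall = 1.0 - correct / total
    per_class = [1.0 - row[i] / sum(row) for i, row in enumerate(confusion)]
    return overall, per_class


def performance_level(error: float) -> str:
    """Map an error proportion (0 to 1) to a qualitative performance level.

    Assumption: a value exactly on a boundary goes to the better level.
    """
    if not 0.0 <= error <= 1.0:
        raise ValueError("error must be a proportion between 0 and 1")
    if error <= 0.10:
        return "Very good"
    if error <= 0.20:
        return "Good"
    if error <= 0.30:
        return "Moderate"
    if error <= 0.45:
        return "Fair"
    return "Poor"


# Toy two-class confusion matrix (hypothetical counts).
confusion = [[8, 2],
             [1, 9]]
overall, per_class = error_rates(confusion)
level = performance_level(overall)
```

In the toy matrix 17 of 20 plots are classified correctly, so the overall error is 0.15 and falls in the "Good" range.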

3. Mapping
The resulting RF classification rules were applied to the maps of environmental variables to map the current suitability of European forest categories (Figure 2).
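An RF classifier assigns a class by majority vote over its trees, so the mapping step amounts to taking, for each grid cell, the category that receives the most votes. The sketch below assumes the per-tree votes are already available for each cell. The names, the `None` marker for cells without data and the tie-breaking rule are illustrative assumptions, not details given in the source.

```python
from collections import Counter
from typing import List, Optional, Sequence

NODATA = None  # assumed marker for cells without predictor data


def majority_category(votes: Sequence[int]) -> int:
    """Return the category with the most tree votes.

    Assumption: ties are broken in favour of the smallest category code.
    """
    counts = Counter(votes)
    best = max(counts.values())
    return min(cat for cat, n in counts.items() if n == best)


def map_categories(
    vote_grid: List[List[Optional[Sequence[int]]]],
) -> List[List[Optional[int]]]:
    """Apply the majority vote to every cell, keeping no-data cells as they are."""
    return [
        [NODATA if votes is None else majority_category(votes) for votes in row]
        for row in vote_grid
    ]


# Toy 2x2 grid: each cell holds the votes of five trees (hypothetical values).
vote_grid = [
    [[1, 1, 3, 1, 7], [7, 7, 3, 7, 7]],
    [None, [2, 3, 3, 2, 5]],
]
category_map = map_categories(vote_grid)
```

The share of votes won by the selected category could be kept alongside the map as a simple indication of how clearly a cell belongs to that category.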

Figure 2 - Habitat suitability of European forest categories (Casalegno et al., 2011).


RESULTS
During model parametrisation, the number of trees in the RF ensemble was increased from the default value of 500 to 1500, and tuning set the number of variables tried at each split to 12. The overall out-of-bag estimate of the model error rate is 24%. The model performance for each forest category is shown in Table 1.


REFERENCES
•Austin, M.P., Smith, T.M. (1989): A new model for the continuum concept. Plant Ecology, 83, 35–47.
•Breiman, L. (2001): Random Forests. Machine Learning, 45, 5–32.
•Casalegno, S., Amatulli, G., Bastrup-Birk, A., Durrant, T., Pekkarinen, A. (2011): Modelling and mapping the suitability of European forest formations at 1-km resolution. European Journal of Forest Research, 130, 971–981.
•EEA (2006): European forest types - categories and types for sustainable forest management reporting and policy. EEA Technical Report No 9/2006, European Environment Agency, Copenhagen.
•Ferrier, S., Guisan, A. (2006): Spatial modelling of biodiversity at the community level. Journal of Applied Ecology, 43, 393–404.
•Lawrence, R.L., Wood, S.D., Sheley, R.L. (2006): Mapping invasive plants using hyperspectral imagery and Breiman Cutler classifications (RandomForest). Remote Sensing of Environment, 100, 356–362.
