Wednesday, September 25, 2013

Introducing imagemetrics

References

In my recent projects, I had the opportunity to work with the professor Raphaël Proulx who introduced me to several metrics commonly used in landscape ecology for quantifying image texture. In order to make my life easier, I decided to implement them as a R package. At this time, the package is still under development and so better documentation/debugging/testing are still needed. Nevertheless, here is a short tutorial on how to use the package.

Step 1: Install the package

The package is currently hosted on Bitbucket and can be installed using devtools.

library(devtools)

install_bitbucket("imagemetrics", "persican")
library(imagemetrics)

Step 2: Open a raster image

Indeed, the first thing to do is to open the image on which you want to calculate the metrics.

## Open the R logo and average on calculate the average on R,G,B channels
r = brick(system.file("external/rlogo.grd", package = "raster"))
r = mean(r)

## Plot the raster (optional)
plot(r, useRaster = FALSE, col = gray((0:100)/100))

plot of chunk chunk_9b

Step 3: Extract pixels from the image to calculate occurrence probabilities

It is worth mentioning that the metrics are calculated on a probability matrix that represents the chances of getting a specific pair of pixel values. For each pixel in the image, we have to choose a neighbor that is located either to the right, bottom right or bottom of a reference pixel (see the following image).

Before calculating such probabilities, we have to extract the values of both reference and neighbor pixels. To do so, the user can use getImagePixels(r, side). The function takes as parameters a raster image (r) and side, a numeric value specifying the neighbor to use. side = 1 for the right pixel, side = 2 for the lower right pixel, side = 3 the bottom pixel.

## Get bottom pixel (side = 3)
v = getImagePixels(r, side = 3)

str(v)
## List of 2
##  $ reference_vector: num [1:7676] 255 255 255 255 255 255 255 255 255 255 ...
##  $ neighbour_vector: num [1:7676] 255 255 255 255 255 255 255 255 255 255 ...

Step 4: Calculate occurrence probabilities

To calculate the probabilities, simply use calculateHisto(reference_vector, neighbour_vector, nbins, where reference_vector and neighbour_vector are vectors returned by getImagePixels and nbins is a numerical value indicating the bin size used to compute histograms.

From Mellin et al. 2012:

“For \( M \) classes of values, the number of possible configurations in a \( k-pixel \) neighborhood is \( M^k \). To ensure that each possible configuration has a reasonable probability of occurring in an image, it is generally recommended that the ratio of the total number of pixels in the image to \( M^k \) be greater than 100.”

suggestMaximumBins® can be used to determine the number of maximum bins to use.

suggestMaximumBins(r)
## [1] 8

## Calculate probability
r.prob = calculateHisto(reference_vector = v$reference_vector, neighbour_vector = v$neighbour_vector, 
    nbins = 6)

Step 5: Compute the metrics

Metrics are thereafter calculated using the object returned by calculateHisto.

contagion(r.prob)
## [1] 0.3469
contrast(r.prob)
## [1] 0.02727
evenness(r.prob)
## [1] 0.8393
homogeneity(r.prob)
## [1] 0.848
jointEntropy(r.prob)
## [1] 2.341
marginalEntropy(r.prob)
## [1] 1.504
maximumMutualInformation(r.prob)
## [1] 0.3724
meanInformationGain(r.prob)
## [1] 0.4669

References

Mellin, C., Parrott, L., Andréfouët, S., Bradshaw, C.J.A. a, MacNeil, M.A. & Caley, M.J. (2012) Multi-scale marine biodiversity patterns inferred efficiently from habitat image processing. Ecological Applications, 22, 792–803.

Proulx, R. & Parrott, L. (2008) Measures of structural complexity in digital images for monitoring the ecological signature of an old-growth forest ecosystem. Ecological Indicators, 8, 270–284.

Sunday, July 7, 2013

Estimating ODE's parameters

In a previous post I used R to solve Ordinary Differential Equations (ODE). Since I'm still more familiar with Matlab to work with ODE, I decided to go one step further and learn how to use R to estimate parameters in ODE.

In this short tutorial, I will use the ODE presented here which quantifies the salt concentration in a well-stirred tank:

\[ \begin{align} &\frac{dx}{dt}=a\frac{150-x(t)}{200}\\[10pt] \end{align} \]

With \[ x(0) = 20 \]

The analytic solution is:

\[ \begin{align} &x(t)=150-130e^{-at/200} \end{align} \]

Step 1

Generate data using the analytic solution and add some random noise. Note that the ODE uses one parameter that has been fixed at \( a = 0.75 \).

t = seq(1, 800, by = 10)
a = 0.75  ## Fixed parameter used to simulate data.

## Simulate data and add noise.
conc.observed = 150 - 130 * exp(-a * t/200) + rnorm(length(t), sd = 5)

## Plot
plot(t, conc.observed, pch = 21, bg = "gray", ylim = c(20, 180), xlab = "Time", 
    ylab = "Salt (kg)", axes = F)
box(bty = "l")
axis(1, tck = -0.02)
axis(2, tck = -0.02, las = 2)

legend("bottomright", legend = c("Observed data"), pch = 21, pt.bg = "gray", 
    bty = "n")

plot of chunk chunk_8a

Step 2

Write a function that will be used to solve the ODE. While we there, solve it with fixed parameter \( a = 0.75 \).

library(deSolve)

salttank = function(t, state, parameters) {
    with(as.list(c(state, parameters)), {



        # rate of change
        dX = a * ((150 - X)/200)


        # return the rate of change
        list(c(dX))
    })
}


## Solve the ODE. Not necessary at this point.

## Define the initial value for the state variable
state = c(X = 20)

## Time range.
times = seq(1, 800, by = 10)

## Parameters
parameters = c(a = 0.75)

conc.modeled = ode(y = state, times = times, func = salttank, parms = parameters, 
    method = "rk4")

Plot the results.

plot(t, conc.observed, pch = 21, bg = "gray", ylim = c(20, 180), xlab = "Time", 
    ylab = "Salt (kg)", axes = F)
box(bty = "l")
axis(1, tck = -0.02)
axis(2, tck = -0.02, las = 2)
lines(conc.modeled[, "time"], conc.modeled[, "X"], col = "red")

legend("bottomright", legend = c("Observed data", "True solution (a = 0.75)"), 
    lty = c(NA, 1), pch = c(21, NA), col = c("black", "red"), pt.bg = "gray", 
    bty = "n")

plot of chunk chunk_8c

Step 3

This is where the magic happens. To fit parameters, I will use nls.lm() from the minpack.lm package. In its simplest form, the function uses par which is a list of guessed parameters and fn a function used to minimize the sum square of the residuals using the Levenberg-Marquardt algorithm.

Now we have to write a function (here I named it ssq) that will use ODE parameters as input (in this case only \( a \)) and produces a residuals vector as output.

ssq = function(params) {

    ## Range and initial condition as before.
    times = seq(1, 800, by = 10)
    state = c(X = 20)

    ## Resolve the ODE.
    out = ode(y = state, times = times, func = salttank, parms = params, method = "rk4")

    ## modeled - observed
    ssq = out[, "X"] - conc.observed
}

Step 4

Finally, we can estimate the parameter \( a \).


library(minpack.lm)

## Start with a guess a = 1
params.guessed = c(a = 1)
params.fitted = nls.lm(par = params.guessed, fn = ssq)

summary(params.fitted)
## 
## Parameters:
##   Estimate Std. Error t value Pr(>|t|)    
## a   0.7438     0.0113      66   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.93 on 79 degrees of freedom
## Number of iterations to termination: 5 
## Reason for termination: Relative error in the sum of squares is at most `ftol'.

We see from the summary of params.fitted that the estimated parameter is \( a = 0.7438 \) which is obviously very close to \( a=0.75 \) used to produce the observed data.

Step 5

Finally, simulate the data using the estimated parameter and plot the results.

## Simulate using fitted parameters
params = coef(params.fitted)

conc.modeled3 = ode(y = state, times = times, func = salttank, parms = params, 
    method = "rk4")
plot(t, conc.observed, pch = 21, bg = "gray", ylim = c(20, 180), xlab = "Time", 
    ylab = "Salt (kg)", axes = F)
box(bty = "l")
axis(1, tck = -0.02)
axis(2, tck = -0.02, las = 2)
lines(conc.modeled[, "time"], conc.modeled[, "X"], col = "red")
lines(conc.modeled3[, "time"], conc.modeled3[, "X"], col = "blue")

legend("bottomright", legend = c("Observed data", "True solution (a = 0.75)", 
    paste("Simulated with fitted parameters (a = ", sprintf("%2.2f", params), 
        ")", sep = "")), lty = c(NA, 1, 1), pch = c(21, NA, NA), col = c("black", 
    "red", "blue"), pt.bg = "gray", bty = "n")

plot of chunk chunk_8g