**February 23, 1999**

This is a note explaining the status of my PhD project and the plans for the thesis. One article is submitted, two is well underway, and one or two more are planned. The requirement for PhD courses is fulfilled and there has been two longer visits at foreign departments.

This paper is the written note required in connection with the midterm seminars for PhD students at KVL. This paper is organized as follows: this introduction explains the background for the project and the overall plan for the thesis. Sections 2 and 3 explains the work already done or well underway, whereas Section 4 discusses the plan for the future work. Appendices A, B, and C list my planned publications, longer visits, and PhD courses, respectively.

The overall idea in the project is to identify tree tops in aerial photos of Norway spruce. The project can be seen as a continuation of [Dra97] (especially [DR96] and [DR97]), and is very closely connected to [Lar97] and [LR98a].

In forestry there is a wish to get information about the number and volume of trees in the forest. This will help management and planning of the forest, as well as the positions of the trees. So far, the expenses in terms of working hours and money to obtain this information has prevented a detailed inventory. Analysis of aerial photos seem to promise a less expensive way to get detailed information on the individual tree. Further background information can be found in [Dra97].

The plan is to make the thesis from the papers listed in Appendix A together with one or more introductory sections and a discussion. One of the five publications listed in Appendix A is submitted and two is well underway. Work on the publications 4 and 5 has not started yet, and they are further discussed in Section 4.

This section describes the work in connection with estimation of tree top positions in aerial photos.

The upper part of Figure 1 shows an aerial photo from a flight 560m above a thinning experiment in Norway spruce. The goal is to identify the positions of tree tops in the image.

**Figure:** Image with sidelighted trees, and, in the lower part,
171 *X*-points (centres of circles) corresponding to `true' tree
tops and 206 *Y*-points (dots) corresponding to template
matching. The area of the delineated subplot is 4454 m ,
and the unit of the axes in the lower part is linear pixel size,
0.15 m.

Morten Larsen has in [Lar97] and [LR98a] developed a template model for one tree taking into account the positions of the camera and light sources. The resulting template, shown in the right part of Figure 2 bounded by an ellipse, was moved pixelwise over the image. Local maxima of the correlation between template and image pixel grey levels were considered as candidate positions of tree tops. The left part of Figure 2 sketches the model for light reflection within a tree.

**Figure 2:** Model tree and, in the right part, template with optimal
bounding ellipse.

The lower part of Figure 1 shows a map of the
image in the upper part. Both the true positions of tree tops
(denoted by circles) and the estimated positions as found by the
template matching method (denoted by dots) are shown. Let
, , denote the positions of
the true points and let ,
denote the positions found by the template matching method. We want
to estimate the true positions *X* but do instead observe *Y*. The
template method in [LR98a] gives 570 candidate positions
for tree tops, but only the 206 best candidates are used here.
See [LR98b] for full details.

As noted above, the *X* and *Y* point sets are not identical. The
idea in the inclusion of more *Y* points than the present number of
*X* points in the data is two fold:

- We can easily think of situations where the number of
*X*points is unknown. - By the use of a model for the transition we
hope to improve the estimation of
*X*(and derived statistics) compared to [LR98a]. This is the goal of Section 2.4.

Suppose that *Y* is generated from the *X*-process by the following
disturbance mechanisms:

**(i)***Thinning*. Each point , is thinned with probability and retained with probability . If an*X*-point is thinned, then there will not be any corresponding*Y*-point. Thinnings are assumed to be independent for different points.**(ii)***Displacement*. For each remaining point a corresponding*Y*-point is generated by displacement to a position with probability density with respect to the Lebesgue measure on . Given*X*, the displacements of different points are independent, mutually and of the thinnings.**(iii)***Censoring*. The displaced points are observed if they are within an observation area*A*; otherwise they are censored and not observed. Thus censoring of an unthinned point generated by occurs with probability . (Here denotes the complement of the set*A*.)**(iv)***Superposition of ghost points*. In addition to the points generated as described above we have superposition of extra 'ghost' points. These points are assumed to arise from a Poisson process on*A*with intensity , where*X*, as above, denotes the entire*X*-process including thinned points. Given*X*, the ghost points are assumed to be independent of thinning, displacement and censoring.

So far I have used a homogeneous thinning probability, *p*(*x*)=*p*, a
normal distribution as the displacement distribution, and
homogeneous poisson noise .

We let denote the set of one-to-one mappings for two finite sets *M* and *M*' with the same
number of elements. The likelihood function for the observation of
*Y* given *X* is given by Theorem 1.

Estimation of the parameters in the model (1) is
difficult due to the very large sum. The sum is over all possible
ways to pair *X* and *Y* points. We will call such a pairing
a state. The state tells which *Y* points that
comes from the Poisson noise, and which *Y* points that are paired
to which *X* points. We will consider estimation of the parameters
in the model in case both *X* and *Y* is observed.

The problem can be considered as a missing data problem with the
information on the state *s* as the missing data. I have suggested a
method based on the concept of ``neighbours'' to a state *s* that
does approximate likelihood analysis. The crucial issue in the
approximate likelihood computation is to find states
such that the corresponding terms give large
contributions to (1), and then focus on only a small
number of terms. This is achieved by use of a
deterministic, iterative algorithm which consists of a starting
procedure for finding an initial set of states and local
maximizations over suitably chosen neighbourhoods of states until no
further improvement is obtained. A neighbour of a state *s* is a
state *s*' which is only slightly different from *s*.

The paper [LR98b] covers the problems presented in Section 2.2 and 2.3. This paper is finished and submitted.

The paper [LPR99] covers the problems described in this section. It is work in progress and it is estimated to be finished in April.

The idea is to use the model (1) to reconstruct *X* when
the parameters *p*, , ..., are known. The knowledge of
the parameters can e.g. be obtained through training data sets
observed under similar conditions as the current data.

Problems similar to this one can be found in [BvL93], [CL98], [DR95]. The first is about a point process description of an image, and the latter two is about detection of mines in minefields.

We will use a Bayesian approach and have a prior distribution on *X*.
The prior distribution should be a regular point pattern, which
expresses that trees in planted and managed forests (as our images)
tend to be placed regularly in the area.

Next, our problem is to say something about the posterier distribution . We will explore this posterior distribution by MCMC samples from the distribution. We make a sampler of Metropolis-Hasting type along the lines in [Gre95] and [GM94]. Some inital experiments with this sampler is shown in Figures 3, 4, and 5.

**Figure 3:** The number of simulated *X* point and the number of matched *X* points in a simulation of length 10000 starting from the empty point configuration.

**Figure:** Same as Figure 3, but without the first 1000 observations.

**Figure 5:** Simulation number 9999. Both the true *X* points (circles),
the observed *Y* points (dots) and the simulated *X* points
(x's) are marked.

The above simulations are made with a Poisson process as the prior distribution. The plan for the paper is to use a prior with regular point patterns, but some minor problems must be solved. Also some work on how to best present the results remain.

The work described in this section is documented in [Lun98] and is not connected to the work described in Section 2. A model for survival times of Sitka spruce based on a discretised version of a Cox model is suggested. The positions af all the trees are known, and a competition index is used as a covariate to take the spatial competition between trees into account. The plan is to write a joint paper together with Jens Peter Skovsgaard based on the course report.

In this section some ideas for the future work is outlined.

In continuation of the work in Section 2 it is quite natural to improve the estimation of the positions by use of several images at the same time. The idea is to estimate the tree top positions in 3-dimensional real world coordinates. Each image only contains information on 2 dimensions and it might not be the same trees that are found in every image. My impression is that a pretty good reconstruction can be made by, say, 3-4 images. A lot of the ground work for this is already done, but of course there is still a lot of outstanding issues.

The thesis could need a more theoretical paper than the ones mentioned above. One idea might be to look into the area of perfect simulation. During my stay in Jyväskylä I participated in a study group about some articles on perfect simulation, and the summer school on coupling methods in Göteborg this summer mentioned in Appendix C is also relevant for this subject.

This summer I have the following plans for conferences, etc.:

- April 12-23: Participation in the workshop ``Spatial modelling, inference and computation in image analysis'' and the visitors program in Göteborg.
- July 11-16: HSSS workshop on ``Statistical modelling of spatial and space-time processes'', Luminy, France.
- August 24-28: Participation in the 11th European Young Statisticians Meeting, Marly-le-Roi, France.

Under consideration is the following events:

- June 14-19: Summerschool on ``Coupling Methods in Probability'', Ellös, Sweden.
- September 14-18: Second European conference on highly structured stochastic systems. (HSSS)

This section describes the planned articles for the thesis.

- Self-thinning among trees. This article will be joint work with Jens Peter Skovsgaard and will be based on the course report [Lun98]. The course report is finished, and work on the article will start in spring 1999.
- [LR98b] is a submitted article about the problems described in Section 2.3.
- [LPR99] is work in progress about the problems described in Section 2.4.
- An article about 3D reconstruction of the tree top positions is planned. See Section 4.1.
- A more theoretical work on perhaps perfect simulation of point processes. This is more speculative. See Section 4.2.

Two longer visits have taken place:

- Feb-May 1997: Department of Mathematics, Chalmers University of Technology, Göteborg, Sweden. Contact person: Mats Rudemo.
- Oct-Dec 1998: Department of Statistics, University of Jyväskylä, Jyväskylä, Finland. Contact person: Antti Penttinen.

A visit at Chalmers during the visitors programme in April-May 1999 is considered.

**Forest Biometrics**- 5 points, held by Jens Peter Skovsgaard, FSL, 1997. The report [Lun98] was part of the course evaluation.
**Percolation with applications**- 3 points, held by Olle Häggström, Chalmers University of Technology, spring 1997.
**Weak convergence**- 2 points, held by Bo Martin Bibby as part of a study group about the book [vdVW96], fall 1997.
**European Summer School, Markov Chain Monte Carlo Methods**- 1
point, held by Jesper Møller, summer 1998.

The requirement for PhD courses is fulfilled by these courses.

I have considered participating in a summer school ``Coupling methods in probability'', Göteborg, June 14-19, 1999.

**BvL93**-
A. J. Baddeley and M. N. M. van Lieshout,
*Stochastic geometry models in high-level vision*, Advances in Applied Statistics: Statistics and Images**1**(1993), 231-255, Supplement to Journal of Applied Statistics vol. 20 Nos. 5/6 1993,*Statistics and images*, eds. K. V. Mardia and G. K. Kanji. **CL98**-
N. Cressie and A. B. Lawson,
*Bayesian hierarchical analysis of minefield data*, Proc. SPIE Vol. 3392, Detection and Remediation Technologies for Mines and Minelike Targets III, Abinash C. Dubey; James F. Harvey; J. Thomas Broach; Eds., 1998, pp. 930-940. **DR95**-
Abhijit Dasgupta and Adrian Raftery,
*Detecting features in spatial point processes with clutter via model-based clustering*, Technical Report 295, Department of Statistics, University of Washington, October 1995, A revised version appeared in Journal of the American Statistical Association 93(1998):294-302. **DR96**-
Kim Dralle and Mats Rudemo,
*Stem number estimation by kernel smoothing of aerial photos*, Canadian Journal of Forest Research**26**(1996), 1228-1236. **DR97**-
Kim Dralle and Mats Rudemo,
*Automatic estimation of individual tree positions from aerial photos*, Canadian Journal of Forest Research**27**(1997), 1728-1736. **Dra97**-
Kim Dralle,
*Locating trees by digital image processing of aerial photos*, Ph. D. Thesis, Dina Research Report No. 58, The Royal Veterinary and Agricultural University, March 1997. **GM94**-
Charles J. Geyer and Jesper Møller,
*Simulation procedures and likelihood inference for spatial point processes*, Scandinavian Journal of Statistics**21**(1994), no. 4, 359-373. **Gre95**-
Peter J. Green,
*Reversible jump Markov chain Monte Carlo computation and Bayesian model determination*, Biometrika**82**(1995), no. 4, 711-732. **Lar97**-
Morten Larsen,
*Crown modelling to find tree top positions in aerial photographs, In the third international airborne remote sensing conference and exhibition*, (Copenhagen, Denmark), vol. II, 7-10 July 1997, pp. 428-435. **LPR99**-
Jens Lund, Antti Penttinen, and Mats Rudemo,
*Reconstruction of a point process from a disturbed observation*, Report, Department of Mathematics and Physics, The Royal Veterinary and Agricultural University, 1999, In preparation. **LR98a**-
Morten Larsen and Mats Rudemo,
*Optimizing templates for finding trees in aerial photographs*, Pattern Recognition Letters**19**(1998), 1153-1162. **LR98b**-
Jens Lund and Mats Rudemo,
*Models for point processes observed with noise*, Report 10, Department of Mathematics and Physics, The Royal Veterinary and Agricultural University, November 4 1998, Submitted. **Lun98**-
Jens Lund,
*Survival of the fattest? Self-thinning among trees*, Report 3, The Royal Veterinary and Agricultural University, Department of Mathematics and Physics, 1998, Course report from PhD course in forest biometrics. **vdVW96**-
Aad W. van der Vaart and Jon A. Wellner,
*Weak convergence and empirical processes*, Springer Series in Statistics, Springer-Verlag, 1996.

**Spatial models related to trees**

This document was generated using the **LaTeX**2`HTML` translator Version 96.1 (Feb 5, 1996) Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.

The command line arguments were:

**latex2html** `-split 0 -show_section_numbers midterm.tex`.

The translation was initiated by Jens Lund on Fri Feb 26 11:37:22 MET 1999

- ...Lund
- Address: The Royal Veterinary and
Agricultural University, Department of Mathematics and Physics,
Thorvaldsensvej 40, DK - 1871 Frederiksberg C, Denmark,
e-mail:
`jlund(at)dina.kvl.dk`. - ...distribution
- In order to avoid the huge sum
in (1) we do consider the state
*s*as missing information here. Written out in full, the expression for the posterior distribution is 25#25. The trick is that 26#26 is very simple.

Fri Feb 26 11:37:22 MET 1999