This thesis presents a highly flexible framework for generic computer vision. The framework is implemented as an essentially object-oriented blackboard system and can readily be adapted to new application domains. This is achieved by allowing application-specific knowledge and data representations to be defined in terms of generic system prototypes. Under the object-oriented programming/frames paradigm, application-specific elements of the system inherit interpretation strategies for finding objects, as well as methods for calculating measurements of their features. The compositional structure of objects and their inter-relationships can also be represented.
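As an illustration of the frames idea described above, the following is a minimal sketch and not the thesis implementation. The class `Frame`, its fields, and the example names (`find_by_region_growing`, `find_elongated_region`, `area`, `elongation`, `road`, `scene`) are all hypothetical, chosen only to show how an application-specific frame might inherit strategies and measurement methods from a generic prototype and record its parts.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Frame:
    """A hypothetical frame: a named prototype with inheritable slots."""
    name: str
    parent: Optional["Frame"] = None
    strategies: List[str] = field(default_factory=list)
    measures: Dict[str, Callable[[dict], float]] = field(default_factory=dict)
    parts: List["Frame"] = field(default_factory=list)

    def all_strategies(self) -> List[str]:
        """Inherited strategies first, then those defined locally."""
        inherited = self.parent.all_strategies() if self.parent else []
        return inherited + [s for s in self.strategies if s not in inherited]

    def measure(self, feature: str, region: dict) -> float:
        """Look up a measurement method on this frame or its ancestors."""
        frame: Optional[Frame] = self
        while frame is not None:
            if feature in frame.measures:
                return frame.measures[feature](region)
            frame = frame.parent
        raise KeyError(f"no method for feature {feature!r} on {self.name}")


# Generic system prototype (assumed names, for illustration only).
generic_object = Frame(
    name="object",
    strategies=["find_by_region_growing"],
    measures={"area": lambda r: r["width"] * r["height"]},
)

# Application-specific frame that inherits from the prototype.
road = Frame(
    name="road",
    parent=generic_object,
    strategies=["find_elongated_region"],
    measures={
        "elongation": lambda r: max(r["width"], r["height"])
        / min(r["width"], r["height"])
    },
)

# Compositional structure: a scene frame that has the road as a part.
scene = Frame(name="scene", parent=generic_object, parts=[road])

region = {"width": 4, "height": 2}
road_strategies = road.all_strategies()
road_area = road.measure("area", region)            # inherited method
road_elongation = road.measure("elongation", region)  # local method
```

In this sketch, adapting to a new domain amounts to defining new frames against the same prototype; the lookup logic itself is unchanged.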
The system automatically generates control strategies for the current domain. Interpretation of an object consists of executing a number of interpretation strategies for that object; because these may be interspersed amongst other interpretation tasks, they are termed dynamic interpretation strategies. Confidence ratings for the object hypotheses created by the interpretation strategies are evaluated and combined consistently. The 'best' hypotheses are stored on the blackboard and used to guide subsequent processing. Dividing an object's interpretation into stages facilitates the early posting of tentative hypotheses on the blackboard, and the system considers alternative competing hypotheses concurrently.
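The staged, interleaved style of interpretation described above can be sketched as follows. This is an illustrative assumption rather than the thesis's control mechanism: the names `Blackboard`, `staged_strategy` and `run` are hypothetical, the round-robin scheduler stands in for the automatically generated control strategies, and the noisy-OR rule in `combine` is a swapped-in example of a consistent combination rule, since the abstract does not state which rule is used.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def combine(confidences: Iterable[float]) -> float:
    """Noisy-OR combination of confidences in [0, 1] (assumed rule)."""
    remaining_doubt = 1.0
    for c in confidences:
        remaining_doubt *= 1.0 - c
    return 1.0 - remaining_doubt


class Blackboard:
    """Hypothetical blackboard holding evidence for competing hypotheses."""

    def __init__(self) -> None:
        self.evidence: Dict[Tuple[str, str], List[float]] = {}

    def post(self, obj: str, label: str, confidence: float) -> None:
        """Post a (possibly tentative) hypothesis with its confidence."""
        self.evidence.setdefault((obj, label), []).append(confidence)

    def best(self, obj: str) -> Optional[Tuple[str, float]]:
        """Return the label with the highest combined confidence."""
        scores = {
            label: combine(cs)
            for (o, label), cs in self.evidence.items()
            if o == obj
        }
        if not scores:
            return None
        label = max(scores, key=scores.get)
        return label, scores[label]


def staged_strategy(
    bb: Blackboard, obj: str, label: str, stage_confidences: List[float]
) -> Iterator[None]:
    """One interpretation strategy split into stages; yields between stages."""
    for confidence in stage_confidences:
        bb.post(obj, label, confidence)  # early, tentative posting
        yield  # hand control back so other tasks can be interspersed


def run(tasks: List[Iterator[None]]) -> None:
    """Round-robin scheduler (an assumed stand-in for generated control)."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)
        except StopIteration:
            pass


bb = Blackboard()
run([
    staged_strategy(bb, "region_7", "road", [0.5, 0.5]),
    staged_strategy(bb, "region_7", "river", [0.6]),
])
best = bb.best("region_7")
```

Both the `road` and `river` hypotheses remain on the blackboard throughout; the second stage of the `road` strategy raises its combined confidence above that of `river`, which is how a later stage can overturn an earlier tentative ranking in this sketch.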
The developed system currently performs region-based image analysis, although the framework can be extended to incorporate edge-based and motion-based analysis. A uniform and consistent approach has been adopted for all objects, including object parts, and all application-specific knowledge is made explicit. New interpretation strategies can easily be incorporated.
A review of related research and background theory is included. Results of example interpretation experiments, covering various applications, are provided for an implementation of the framework on both real and simulated images.