Artificial neural networks for object recognition, a subfield of artificial intelligence (AI), are increasingly being used in industry, for example in the implementation of autonomous driving functions. These networks are trained using large amounts of data. Creating the image data for this is a very time-consuming process, as each image usually has to be labeled manually. Labels are additional pieces of information that identify objects in the image, making them usable for further processing with AI methods.
For object recognition systems in safety-critical areas such as automotive and railway technology, there is the additional challenge that critical scenarios are often inadequately represented in the training data set, precisely because they are usually contextually rare or unexpected events. Such disruptive events can include extreme weather conditions, for example, but also camera errors such as lens distortion or chromatic aberration. However, verification of the system for rarely occurring hazardous situations is absolutely essential.
If a training data set contains too little image data for a meaningful test, new test images must be generated. Manually creating test images is often very time-consuming or simply impractical due to the amount of data required. However, by specifically modifying existing image information, new test data can be generated and made usable for the test.
Metamorphic image transformations are changes made to images that do not affect the label of the image. For example, if the brightness of an image is reduced, the image changes, but the position and type of objects recognizable in it do not.
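The brightness example can be sketched in a few lines. This is a minimal illustration, not the project's code: the function name `darken`, the `factor` parameter, and the label tuple layout are assumptions made for the example.

```python
import numpy as np

def darken(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale all pixel values by `factor` (0 = black, 1 = unchanged)."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be in [0, 1]")
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

# Hypothetical labeled sample: an image plus bounding boxes (x, y, w, h, class).
image = np.full((4, 4, 3), 200, dtype=np.uint8)
labels = [(1, 1, 2, 2, "car")]

dark_image = darken(image, 0.5)
dark_labels = list(labels)  # the labels are carried over unchanged
```

The pixel values change, but the bounding boxes and classes stay valid, so the darkened image can be tested against the original labels.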
When creating metamorphically altered images, it is important to systematically consider the potential area of application for object recognition. What errors can occur? What rare situations are conceivable?
One suitable method is classification tree analysis. The paths of the tree describe environmental variables and sources of error organized into categories (see Figure 1). Useful categories for a system used in road or rail traffic include weather and lighting conditions, motion disturbances, and camera hardware and software errors.
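A classification tree of this kind can be represented as a simple mapping from categories to their classes. The following sketch uses made-up categories and class names (they are not taken from Figure 1) and enumerates every combination of one class per category as a candidate test case.

```python
from itertools import product

# Hypothetical classification tree: category -> possible classes.
classification_tree = {
    "weather": ["clear", "rain", "fog", "snow"],
    "lighting": ["day", "dusk", "night"],
    "camera": ["none", "lens_distortion", "chromatic_aberration"],
}

def enumerate_test_cases(tree):
    """Return every combination of exactly one class per category."""
    categories = list(tree)
    return [dict(zip(categories, combo)) for combo in product(*tree.values())]

test_cases = enumerate_test_cases(classification_tree)
```

Full enumeration grows quickly with the number of categories, so in practice a subset of combinations would probably be selected.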
The disturbances contained in such a tree are then simulated, i.e., generated via image modifications, without changing the labels contained in the image. Otherwise, the image material must be relabeled manually. Changes such as inserting and completely covering objects are therefore not useful.
To ensure that the correct labels are retained, image data generation in this approach is carried out exclusively with the aid of deterministic algorithms, as these always produce identical output for the same input. Other approaches to data generation, such as neural networks, are less suitable for verifying object recognition systems, as their probabilistic nature introduces uncertainties into the verification process.
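Transformations that contain random elements can still be deterministic if the random number generator is seeded. The sketch below assumes a fixed seed is passed in explicitly; whether the project handled randomness this way is not stated in the article, and the function name `add_sensor_noise` is hypothetical.

```python
import numpy as np

def add_sensor_noise(image, strength, seed):
    """Add Gaussian noise; a fixed seed makes the result reproducible."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, strength, size=image.shape)
    noisy = image.astype(np.float32) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

image = np.full((8, 8, 3), 128, dtype=np.uint8)
first = add_sensor_noise(image, strength=10.0, seed=42)
second = add_sensor_noise(image, strength=10.0, seed=42)

# Same input and same seed give the same output.
identical = np.array_equal(first, second)
```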
For interference simulation, simple changes to the color space and pixel values can be made, or more complex algorithms can be used (see Figure 2). The simulation should also be parameterizable so that the strength of the image change can be adjusted. This allows the system to be tested under varying degrees of condition changes.
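A parameterizable transformation can be as simple as a blend controlled by a single strength value. The following sketch is a crude haze effect, assuming a uniform light gray of 220 as the haze color; it is far simpler than a realistic fog simulation, and the name `apply_haze` is chosen for this example only.

```python
import numpy as np

def apply_haze(image, strength):
    """Blend toward light gray; strength 0 = unchanged, 1 = fully gray."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    haze = np.full(image.shape, 220.0, dtype=np.float32)  # assumed haze color
    blended = (1.0 - strength) * image.astype(np.float32) + strength * haze
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)

image = np.zeros((4, 4, 3), dtype=np.uint8)

# One test image per strength level.
variants = {s: apply_haze(image, s) for s in (0.0, 0.25, 0.5, 1.0)}
```

Generating a series of variants like this makes it possible to observe at which strength the system under test starts to fail.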
As part of the AI-LOK research project, ITPower Solutions created several such algorithms and used them to generate test data. The image transformations were implemented with the Python libraries NumPy and OpenCV, the latter being a library of image processing methods.
One example is an algorithm that simulates raindrops on a window pane (see Figure 3). Refractive circles are placed at random positions in the image and deformed. The position and size of each drop are chosen randomly, and the number of drops can be set by the user. A probability distribution ensures that small drops occur more often than large ones. For each position inside a drop, the height of the drop is then computed. The height follows a cosine function and is perturbed with simplex noise so that each drop has an irregular shape. Next, the normal vector of the drop surface at that position is determined. Only the X and Y components of this vector are used; they are subtracted from the current image position to create a refraction effect. An excerpt of the source code for this algorithm is shown in Figure 4.
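The steps described above can be approximated as follows. This is a simplified sketch and not the code from Figure 4: it uses NumPy only, omits the simplex noise (so the drops stay circular), samples with nearest-neighbor lookup, and all names and default values (`add_raindrops`, `max_radius`, `refraction`, `seed`) are assumptions.

```python
import numpy as np

def add_raindrops(image, num_drops, max_radius=12.0, refraction=30.0, seed=0):
    """Refract the image through randomly placed, cosine-shaped drops."""
    rng = np.random.default_rng(seed)  # fixed seed keeps the output reproducible
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    height = np.zeros((h, w), dtype=np.float32)
    min_radius = 4.0

    for _ in range(num_drops):
        cx, cy = rng.uniform(0, w), rng.uniform(0, h)
        # Squaring a uniform sample makes small radii more likely than large ones.
        radius = min_radius + (max_radius - min_radius) * rng.uniform() ** 2
        dist = np.sqrt((xs - cx) ** 2 + (ys - cy) ** 2)
        # Cosine profile: height 1 at the center, 0 at the rim.
        drop = 0.5 * (1.0 + np.cos(np.pi * dist / radius))
        height = np.where(dist < radius, np.maximum(height, drop), height)

    # Surface normal of the height field: (-dh/dx, -dh/dy, 1), normalized.
    grad_y, grad_x = np.gradient(height)
    norm = np.sqrt(grad_x ** 2 + grad_y ** 2 + 1.0)
    normal_x, normal_y = -grad_x / norm, -grad_y / norm

    # Subtract the X and Y components from the pixel position (refraction).
    src_x = np.clip(np.rint(xs - refraction * normal_x), 0, w - 1).astype(np.intp)
    src_y = np.clip(np.rint(ys - refraction * normal_y), 0, h - 1).astype(np.intp)
    return image[src_y, src_x]
```

Outside the drops the height map is flat, the normal has no X or Y component, and the pixels are left where they were. The `refraction` factor serves as the strength parameter of the effect.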
To evaluate the extent to which metamorphic transformations can disrupt an object recognition system, additional transformation algorithms were implemented. These included simple effects such as darkening the image, but also complex weather effects such as heat haze, fog, or snow.
As part of the project, these metamorphic transformations were applied to an object recognition dataset to investigate their effect on a system trained with it. The dataset used is Microsoft COCO, a comprehensive dataset for object recognition, image segmentation, and captioning that is widely used in machine learning and image recognition projects.
The transformed images were then used to test an example object recognition system. Transformations of varying strengths were applied, and the accuracy of the system on the transformed images was compared with its accuracy on the unaltered images. The system under test (SUT) is the CenterNet Hourglass neural network by Duan et al. (2019). This network has a high degree of accuracy and is therefore particularly well suited for evaluating metamorphic transformations.
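The comparison across strengths can be organized as a small evaluation loop. The sketch below is a toy setup, not the experiment itself: `toy_detector` is a hypothetical stand-in for the SUT, and a plain fraction of correct predictions replaces the detection metric used in the project.

```python
import numpy as np

def darken(image, strength):
    """Scale pixel values down; strength 0 = unchanged, 1 = black."""
    return (image.astype(np.float32) * (1.0 - strength)).astype(np.uint8)

def toy_detector(image):
    """Hypothetical SUT: reports an object if any pixel is brighter than 100."""
    return bool(image.max() > 100)

def accuracy_by_strength(images, labels, transform, strengths, detector):
    """Score the detector on transformed images against the ORIGINAL labels."""
    results = {}
    for strength in strengths:
        predictions = [detector(transform(img, strength)) for img in images]
        correct = sum(p == l for p, l in zip(predictions, labels))
        results[strength] = correct / len(labels)
    return results

def make_image(value):
    """Dark image with a single bright 'object' pixel."""
    img = np.zeros((4, 4), dtype=np.uint8)
    img[1, 1] = value
    return img

images = [make_image(v) for v in (120, 160, 240)]
labels = [True, True, True]  # every image contains an object

results = accuracy_by_strength(
    images, labels, darken, (0.0, 0.25, 0.5, 0.75), toy_detector
)
```

In this toy setup the score falls from 1.0 on the unaltered images to 0.0 at strength 0.75, although the labels never change. The strength-0 entry serves as the baseline.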
Figure 5 shows an excerpt from the results of the experiment. Accuracy is highest on the unaltered images, at 0.52, and decreases to varying degrees as the transformation strength increases. The object recognition system is clearly disrupted most severely by directional motion blur. The method therefore indicates where development work needs to focus. In this case, measures must be taken to prevent or minimize the occurrence of motion blur.
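For illustration, a directional motion blur can be approximated by averaging copies of the image that are shifted along one direction. This is a NumPy-only sketch under that assumption and not the transformation used in the experiment; an OpenCV implementation would more likely convolve the image with a line-shaped kernel. The names `motion_blur`, `length`, and `direction` are hypothetical.

```python
import numpy as np

def motion_blur(image, length, direction=(1, 0)):
    """Average `length` copies of the image shifted along `direction` (dx, dy).

    np.roll wraps around at the borders, which is a simplification.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    dx, dy = direction
    acc = np.zeros(image.shape, dtype=np.float32)
    for i in range(length):
        acc += np.roll(image, shift=(i * dy, i * dx), axis=(0, 1))
    return np.clip(np.rint(acc / length), 0, 255).astype(np.uint8)

# A single bright pixel is smeared along its row.
image = np.zeros((5, 5), dtype=np.uint8)
image[2, 2] = 250
blurred = motion_blur(image, length=5, direction=(1, 0))
```

Here `length` acts as the strength parameter: a value of 1 leaves the image unchanged, and larger values smear each pixel further along the chosen direction.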
The starting point was the question of whether metamorphic image transformations are suitable for testing an object recognition system. In the approach described above, various transformations were implemented and applied to a data set, and an example system was then tested with the transformed images. The results show that metamorphic transformations are indeed suitable for generating test data that reveals specific weaknesses in an object recognition system, which in turn helps to increase the reliability of neural networks. The method should also be applicable to the generation of training data.
