When it comes to detecting image manipulation, the more tools you have at your disposal, the better. In a recent issue of Science and Engineering Ethics, Lars Koppers at TU Dortmund University in Germany and his colleagues present a new way to scan images. Specifically, they created open-source software that compares pixels within or between images, looking for similarities, which can signify that portions of an image have been duplicated or deleted. Koppers spoke with us about the program described in “Towards a Systematic Screening Tool for Quality Assurance and Semiautomatic Fraud Detection for Images in the Life Sciences,” and how it can be used by others to sleuth out fraud.
Retraction Watch: Can you briefly describe how your screening system works?
Lars Koppers: We wrote some functions for the programming language R which compare pixels or neighborhoods of pixels between different images or in different areas of one image. In many images, many pixels are identical by chance. But if there are two regions with identical pixels in the same relative position to one another, this could be a sign of a duplicated area. To find those duplicated areas, we shift two images against each other or one image against itself and count overlying identical pixels. Shifts that include more identical pixel pairs than expected could be selected for further examination, because they could be a sign of deleted data.
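The shift-and-count idea described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' R code: the function name `count_identical`, the list-of-lists image layout, and the toy pixel values are all assumptions made for the example.

```python
def count_identical(a, b, dy, dx):
    """Count positions where a[y][x] equals b[y + dy][x + dx].

    `a` and `b` are images stored as lists of rows (hypothetical layout).
    Only the region where the shifted images overlap is compared.
    """
    h_a, w_a = len(a), len(a[0])
    h_b, w_b = len(b), len(b[0])
    count = 0
    for y in range(max(0, -dy), min(h_a, h_b - dy)):
        for x in range(max(0, -dx), min(w_a, w_b - dx)):
            if a[y][x] == b[y + dy][x + dx]:
                count += 1
    return count


# Toy image: the 2x2 patch (90, 91 / 92, 93) appears twice, 3 columns apart.
image = [
    [1, 2, 3, 4, 5, 6],
    [7, 90, 91, 10, 90, 91],
    [13, 92, 93, 16, 92, 93],
    [19, 20, 21, 22, 23, 24],
]

# Shifting the image against itself by 3 columns lines the two patches up.
at_copy_offset = count_identical(image, image, 0, 3)
at_other_offset = count_identical(image, image, 0, 1)
print(at_copy_offset, at_other_offset)
```

In this toy case the shift that matches the copy offset yields four identical pixel pairs and a neighboring shift yields none. In real images many pixels match by chance, so it is the excess over the usual count that matters.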
RW: You ask a pertinent question – how can you detect when something has been deleted from an image?
LK: If I want to delete existing data in images, I have to replace it. A replacement by a monochrome area (e.g. black or white) is too noticeable. An easy way to hide a deletion is to copy and paste some background from another part of the image. Now this background pattern exists twice: the original and the pasted version, which should hide the unwanted data. If we want to find signs of deleted data, we have to look for duplicated background. With this procedure, we cannot restore the deleted data, but we make the manipulated areas visible.
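To illustrate how duplicated background can make a covered area visible, here is a small Python sketch under stated assumptions: the helper `matching_pixels`, the toy image, and the pixel values are hypothetical and are not taken from the authors' software. A "band" (values 99 and 98) is hidden by pasting background over it, and the matching shift then points to both the source and the pasted region.

```python
def matching_pixels(image, dy, dx):
    """Return (y, x) positions where image[y][x] == image[y + dy][x + dx]."""
    h, w = len(image), len(image[0])
    hits = []
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            if image[y][x] == image[y + dy][x + dx]:
                hits.append((y, x))
    return hits


original = [
    [11, 12, 13, 14, 15, 16],
    [21, 22, 23, 24, 25, 26],
    [31, 32, 33, 99, 98, 36],  # 99, 98 stand in for the unwanted data
    [41, 42, 43, 44, 45, 46],
]

# Simulate the manipulation: paste background from row 0 over the band.
manipulated = [row[:] for row in original]
manipulated[2][3:5] = original[0][0:2]

# The shift (2 rows down, 3 columns right) maps the source onto the paste.
dy, dx = 2, 3
sources = matching_pixels(manipulated, dy, dx)
pasted = [(y + dy, x + dx) for (y, x) in sources]
print("source pixels:", sources)
print("pasted pixels:", pasted)
```

The sketch only locates the duplicated background. As stated above, the data that was overwritten cannot be recovered this way.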
RW: How did you determine whether your system is effective at rooting out true manipulation, without false positives?
LK: Our algorithms find outlier shifts, i.e. shifts including “more than usual” identical pixels. Unusual values are not necessarily the result of a manipulation. A next step could be a visual evaluation by an expert. In a (semi-)automated system, the challenge would be to find a threshold with high sensitivity without too many false positives. The important thing is that findings of the algorithm are not a final judgment on whether the image is manipulated.
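The interview does not specify how “more than usual” is quantified. As one possible sketch, the Python snippet below flags shifts whose count exceeds the median by a multiple of the median absolute deviation. That rule, the function name `flag_outlier_shifts`, the default `k`, and the example counts are all assumptions made for illustration, not the authors' method.

```python
from statistics import median


def flag_outlier_shifts(counts, k=5.0):
    """Return shifts whose identical-pixel count is unusually high.

    `counts` maps a shift (dy, dx) to its number of identical pixels.
    The cutoff median + k * MAD is an assumed rule, chosen for simplicity;
    the MAD is floored at 1 so constant data does not flag everything.
    """
    values = list(counts.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    threshold = center + k * max(mad, 1.0)
    return sorted(shift for shift, n in counts.items() if n > threshold)


# Hypothetical counts: most shifts match a few pixels by chance, one stands out.
example_counts = {
    (0, 1): 3,
    (0, 2): 4,
    (1, 0): 2,
    (1, 1): 3,
    (1, 2): 5,
    (2, 3): 40,
}

flagged = flag_outlier_shifts(example_counts)
print(flagged)
```

Lowering `k` raises sensitivity but flags more shifts that match only by chance, which is the trade-off described above. Anything flagged is a candidate for expert review, not a verdict.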
RW: Other researchers and journals are already screening images for signs of manipulation – how is your approach different from the rest?
LK: We do not offer a complete software solution or consultation. Our algorithms are open source so everyone can use them or implement them in their own scanning routine. Our aim was to get results which can be implemented in an automated routine. That’s why our output is not a processed image but a matrix including the number of identical pixels for all possible shifts. This in principle allows an automated scanning routine by filtering only those images that include shifts with a suspicious number of identical pixels. The challenge in an automated process would be choosing the right threshold, because every positive match has to be examined by an expert.
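A matrix-style output of this kind, and a filter built on top of it, might look like the following Python sketch. The names `shift_matrix` and `needs_review`, the fixed threshold, and the toy images are hypothetical, and the brute-force loops are for clarity rather than speed.

```python
def shift_matrix(image):
    """Count identical pixels for every shift of an image against itself.

    Returns a (2h - 1) x (2w - 1) list of lists. Entry [dy + h - 1][dx + w - 1]
    holds the count for shift (dy, dx); the center entry is the zero shift.
    """
    h, w = len(image), len(image[0])
    matrix = []
    for dy in range(-(h - 1), h):
        row = []
        for dx in range(-(w - 1), w):
            count = 0
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    if image[y][x] == image[y + dy][x + dx]:
                        count += 1
            row.append(count)
        matrix.append(row)
    return matrix


def needs_review(image, threshold):
    """True if any non-zero shift reaches `threshold` identical pixels."""
    h, w = len(image), len(image[0])
    matrix = shift_matrix(image)
    return any(
        count >= threshold
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if (i, j) != (h - 1, w - 1)  # skip the trivial zero shift
    )


# Toy data: one image with all-distinct pixels, one with a duplicated patch.
clean = [[y * 6 + x for x in range(6)] for y in range(4)]
suspicious = [
    [1, 2, 3, 4, 5, 6],
    [7, 90, 91, 10, 90, 91],
    [13, 92, 93, 16, 92, 93],
    [19, 20, 21, 22, 23, 24],
]

to_review = [name for name, img in [("clean", clean), ("suspicious", suspicious)]
             if needs_review(img, threshold=3)]
print(to_review)
```

Because the output is a matrix of numbers and not a processed image, the filtering step needs no human in the loop. Only the images that pass the filter would go to an expert.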
RW: How can other researchers and journals adopt your technique?
LK: Our code is open source. That way, everyone can implement and enhance it in their own scanning routine. Pixelwise comparisons only cover simple copy-and-paste manipulations. They do not work on rescaled images yet. These algorithms are our contribution to a possible toolbox of automated scanning algorithms. Only a variety of different algorithms ensures that all kinds of manipulations in images can be found. Developing tools to find manipulated images has the same problem cryptographers have: every tool can also be used for optimization by “the other side.” For instance, those who manipulate images can use each algorithm to ensure that their manipulation cannot be found by this algorithm – so having as many tools as possible at your disposal gives you the best chance of catching manipulations.