Wednesday, 23 November 2011

Archiving for fraud detection

From: Chris Morris
Date: 4 November 2011 08:09


One argument for archiving images has been that reprocessing could demonstrate deliberately deceptive structures.

In fact, what is needed for this is not necessarily the image. It is the last data file that was produced by a trusted computer. If the structure depends on MTZ files produced at the synchrotron, then it is sufficient to authenticate the reduced data. The images are only needed for this purpose if they have been reprocessed.

regards,
Chris
____________________________________________
Chris Morris  
Daresbury Lab,  Daresbury,  Warrington,  UK,  WA4 4AD


----------
From: Zhijie Li


If the data files generated on trusted computers carried digital signatures, they would be more trustworthy. Otherwise, a person with the proper knowledge can still manipulate the data files, even if they are binary. If image processing software routinely incorporated encrypted key information from the original data into the final data files, then data from any computer could be trusted. This would be best, considering that in real life we often have to combat ice rings, split reflections, etc., at home.
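
As a rough illustration of the tamper-evidence idea, here is a minimal sketch using Python's standard library. It uses an HMAC (a keyed hash) as a stand-in for a true public-key digital signature, which would need a third-party library or a tool such as OpenSSL. The key, the function names, and the sample reflection data are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held only by the trusted (e.g. beamline) computer.
TRUSTED_KEY = b"example-key-held-by-the-trusted-computer"


def sign_data_file(data: bytes, key: bytes = TRUSTED_KEY) -> str:
    """Return a hex tag that binds the file contents to the key holder."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_data_file(data: bytes, tag: str, key: bytes = TRUSTED_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, data, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)


# Stand-in for a reduced data file; real files would be read from disk.
reduced = b"H K L I SIGI\n1 0 0 1523.4 12.1\n"
tag = sign_data_file(reduced)

# Editing a single intensity changes the tag, so verification fails.
tampered = reduced.replace(b"1523.4", b"9523.4")
print(verify_data_file(reduced, tag), verify_data_file(tampered, tag))
```

With a symmetric key like this, whoever verifies the tag could also forge one, so a real scheme would use a private key for signing and a public key for verification.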

For example, if the frames used for indexing the dataset are encrypted and saved with or within the final data file as "proof of experiment", in a universal format that can be used by the structure deposition servers to verify the reported space group, resolution, and, to some degree, the structure itself, it would probably serve the purpose.

Zhijie



----------
From: James Stroud


Although this is a good idea from the perspective of storage, it is difficult to implement. 

For this idea to work, you need (1) a certificate system and (2) a certificate authority. The certification is necessary to verify that the data file was indeed generated by a trusted computer. The chosen file needs to be certified by the authority and the certification archived on a trusted system. None of these requirements is terribly problematic. The infrastructure for a certificate system is free in the form of OpenSSL. Almost any lab or institution could easily become a certificate authority. The storage requirements for the certificates are trivial. For example, if a certificate were 2 KB, then, for 8,000 structures per year, the storage requirement would be 16 MB per year. It would take about 125 years to fill up my $14.95 2 GB thumb drive.
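
The storage estimate can be worked through directly from the stated figures (2 KB per certificate, 8,000 structures per year, a 2 GB drive). The short calculation below assumes decimal units (1 KB = 1000 bytes); the variable names are illustrative only.

```python
CERT_SIZE_BYTES = 2 * 1000        # 2 KB per certificate (decimal units assumed)
STRUCTURES_PER_YEAR = 8_000       # structures per year, as stated in the text
DRIVE_BYTES = 2 * 1000**3         # capacity of a 2 GB thumb drive

bytes_per_year = CERT_SIZE_BYTES * STRUCTURES_PER_YEAR
mb_per_year = bytes_per_year / 1000**2
years_to_fill = DRIVE_BYTES / bytes_per_year

print(f"{mb_per_year:.0f} MB per year, drive full after {years_to_fill:.0f} years")
```

Under these assumptions the certificates take 16 MB per year and the drive fills after 125 years, so the storage cost remains negligible.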

The difficulty is that certification should be done on the file before it is transferred from the trusted computer. This requires inserting the certification process somewhere in the transfer pipeline, which is difficult because it requires all the synchrotrons to actually implement it. Allowing the user to produce the certificate after transfer is as useful as having no certificate system at all.

Then there is the issue of data collection on a home source.

James


