CCP4 Bulletin Board Archive: Overlapmap

From: Brigitte Ziervogel
Date: 4 October 2011 21:55

Hi,

I am using the program Overlapmap to calculate real-space R-factors and correlation coefficients in order to find ligand conformations that fit best within the density.

I'm confused by the Overlapmap output, which includes "Fobs" and "Fcalc" values that are used to calculate the R-factors and corr coeff. However, I'm not sure what these F values are as they should not be structure factors since the program seems to only deal with maps. Additionally, in many cases the Fobs and Fcalc values are either 0 or negative values, even for protein residues that are well-defined in the density.

Has anyone used this program before, or does anyone have an idea of what could be going on here?

I have been supplying the program with a refmac mtz file with the ligand unmodeled as map 1, and a pdb file with both protein and ligand coordinates to calculate map 2.

Any suggestions or ideas of better ways to score ligand fits are appreciated, thanks.

Brigitte

----------
From: Pavel Afonine

Hi,

On Tue, Oct 4, 2011 at 1:55 PM, Brigitte Ziervogel <bkziervogel@uchicago.edu> wrote:

(...)

another alternative:

phenix.model_vs_data model.pdb data.mtz comprehensive=true

will list a triplet of numbers, {map CC, 2mFo-DFc value, mFo-DFc value}, for each atom or residue in your structure. A low map CC, a low 2mFo-DFc density value, and/or significantly non-zero mFo-DFc values will point out problems.
More info:
http://www.phenix-online.org/documentation/model_vs_data.htm

Let me know if you have questions,
Pavel

----------
From: Adam Ralph

 Hi Brigitte,

    You are correct, the columns labeled Fobs and Fcal refer to density. The columns should be: averaged density for Obs and Cal for the main chain, then averaged density Obs and Cal for the side chains. I have included a version of overlapmap that calculates the R-factor correctly and have changed the above columns in the output.

All the best Adam

----------
From: Ian Tickle

Documentation should be fixed too :)

-- Ian

----------
From: Ian Tickle

Adam, sorry I don't use overlapmap, for the reasons you mention (and many others!). In fact I decided it was in such a mess that it was irrecoverable & so I wrote my own program EDSTATS to do all these electron density stats (and more). I talked about it at the last CSW (http://www.cse.scitech.ac.uk/events/CCP4_2011/talks/tickle.pdf), and submitted it to CCP4 in January but it doesn't seem to have made the latest release yet (or even the pre-release), so at the moment I'm just distributing it to anyone who's interested.

Cheers

-- Ian

On Wed, Oct 5, 2011 at 12:31 PM, Adam Ralph <adam.ralph@maths.nuim.ie> wrote:

Hi Ian,

Yes I agree. The whole program is a bit of a mess and could do with some updating.
I am not an official CCP4 maintainer any more but I might send them something. Do you
know how to use SFALL ATMMAP? I tried to test overlapmap's residue correlation but it did
not work properly.

Adam

----------
From: Eleanor Dodson

Adam, I do know how to use sfall atmmap

Eleanor

----------
From: James Holton

Sounds like what you are trying to do is similar to an old jiffy script of mine:
http://bl831.als.lbl.gov/~jamesh/pickup/local_corr.com

My purpose was to compute the correlation coefficient (CC) for a bunch of different rotamers of a side chain, but I think the script will work for you "as is". What I learned is that you want to make a set of PDB files that contain only the "variable" atoms in the structure (in your case, just the ligands, no protein). Otherwise, the "signal" you are trying to measure is swamped by all the other atoms in the structure. Then you want to "select" map grid points that are near ANY of the atoms in your set of PDB files and score all your PDBs against that SAME map. If you don't do this, you will always find that bigger stuff matches better because bigger stuff simply intersects more density.
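A minimal sketch of the common-mask idea described above, in Python with NumPy. It is not the local_corr.com script: the function names, the 2.0 A selection radius, the Gaussian blob used as a stand-in for a calculated map, and the Cartesian grid with no symmetry or periodicity are all assumptions made for illustration.

```python
import numpy as np

def common_mask(grid_xyz, candidate_coords, radius=2.0):
    """True for grid points within `radius` of ANY atom of ANY candidate model."""
    all_atoms = np.vstack(candidate_coords)  # (total_atoms, 3)
    dist = np.linalg.norm(grid_xyz[:, None, :] - all_atoms[None, :, :], axis=2)
    return (dist <= radius).any(axis=1)

def masked_cc(obs, calc, mask):
    """Correlation coefficient over the selected grid points only."""
    return float(np.corrcoef(obs[mask], calc[mask])[0, 1])

def gaussian_density(grid_xyz, coords, width=0.7):
    """Toy calculated density: one Gaussian blob per atom (illustration only)."""
    d2 = ((grid_xyz[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2)).sum(axis=1)

# Cartesian grid of points, 0.5 A spacing, flattened to (n_points, 3)
axis = np.arange(0.0, 10.0, 0.5)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

# Two hypothetical candidates containing only the "variable" atoms
candidates = {
    "right": np.array([[4.0, 5.0, 5.0], [5.5, 5.0, 5.0]]),
    "wrong": np.array([[4.0, 5.0, 5.0], [4.0, 6.5, 5.0]]),
}

# Pretend the observed map looks like the "right" candidate
obs = gaussian_density(grid, candidates["right"])

# One mask built from ALL candidates, reused for every score
mask = common_mask(grid, list(candidates.values()))
scores = {name: masked_cc(obs, gaussian_density(grid, xyz), mask)
          for name, xyz in candidates.items()}
print(scores)
```

Because every candidate is scored over the same set of grid points, a larger model gains nothing merely by covering more of the map.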

I also found it is much better to use the CC of the Laplacian of the electron density maps, rather than the CC of the raw electron density itself. By "better" I mean that, with the raw density, a "wrong" rotamer that happens to stick itself into the middle of a nearby helix or heavy metal will "correlate" very well, and at poor resolution it will actually correlate better than the "right" rotamer. This is because the CC (and the R factor) essentially "score" the overall density overlap, whereas the Laplacian seems to "score" how connected the density is. The Laplacian filter does have the unfortunate effect of amplifying the noise of high-angle spots, so applying a B factor after the Laplacian can make things behave better. Exactly which smoothing B factor is optimal is something I have yet to figure out.
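One way to picture the Laplacian-plus-B-factor filter is in reciprocal space, where the Laplacian multiplies each Fourier coefficient by -4*pi^2*s^2 (s = 1/d) and a smoothing B factor multiplies it by exp(-B*s^2/4). The sketch below is an illustration only, assuming a periodic map sampled on an orthogonal cell; the function name and the `b_smooth` parameter are hypothetical, and this is not how the jiffy script itself does it.

```python
import numpy as np

def laplacian_filtered(rho, cell, b_smooth=0.0):
    """Laplacian of a periodic map on an orthogonal cell, then a B-factor blur.

    rho      : 3D array of density values covering one full cell
    cell     : (a, b, c) edge lengths in Angstrom (orthogonal cell assumed)
    b_smooth : smoothing B factor in A^2 applied after the Laplacian
    """
    nx, ny, nz = rho.shape
    a, b, c = cell
    # Reciprocal-space coordinates h/a, k/b, l/c in 1/Angstrom
    sx = np.fft.fftfreq(nx, d=a / nx)
    sy = np.fft.fftfreq(ny, d=b / ny)
    sz = np.fft.fftfreq(nz, d=c / nz)
    s2 = (sx[:, None, None] ** 2
          + sy[None, :, None] ** 2
          + sz[None, None, :] ** 2)
    coeffs = np.fft.fftn(rho)
    coeffs = coeffs * (-4.0 * np.pi ** 2 * s2)        # Laplacian: amplifies high-angle terms
    coeffs = coeffs * np.exp(-b_smooth * s2 / 4.0)    # B factor: damps them again
    return np.fft.ifftn(coeffs).real

def map_cc(map1, map2):
    """Correlation coefficient between two maps on the same grid."""
    return float(np.corrcoef(map1.ravel(), map2.ravel())[0, 1])

# Small check on a single cosine wave along x, where the answer is known
a_len = 8.0
x = np.arange(16) * (a_len / 16)
rho = np.cos(2.0 * np.pi * x / a_len)[:, None, None] * np.ones((1, 4, 4))
lap = laplacian_filtered(rho, (a_len, 2.0, 2.0))
print(np.allclose(lap, -(2.0 * np.pi / a_len) ** 2 * rho))
```

To compare two maps this way, apply `laplacian_filtered` to both with the same `b_smooth` and pass the results to `map_cc`.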

I think the reason comparing Laplacian-ized maps works better is that when we mortals look at maps, what we are looking at is an edge detection (we contour the map) next to another kind of edge detection (bonds between atoms). I'm told that comparing Laplacians instead of direct pixels is a fairly standard methodology in machine vision, but I don't have a reference for that.

As for the real-space R factor, I have always found this to be highly sensitive to the scale and offset of the maps, whereas the correlation coefficient is completely insensitive to scale factors. Since I can't think of anything that the real-space R would tell me that the CC wouldn't, I have always used the latter.
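The scale and offset point can be seen with a toy example. The R-factor form used here, sum|obs - calc| / sum|obs + calc|, is an assumption (a commonly used real-space R definition); the thread does not say exactly which formula overlapmap applies. The random "map" is made up for illustration.

```python
import numpy as np

def real_space_r(obs, calc):
    """Real-space R in the assumed form sum|obs - calc| / sum|obs + calc|."""
    return float(np.abs(obs - calc).sum() / np.abs(obs + calc).sum())

def map_cc(obs, calc):
    """Pearson correlation coefficient between two maps."""
    return float(np.corrcoef(obs.ravel(), calc.ravel())[0, 1])

rng = np.random.default_rng(0)
calc = rng.random(1000)      # made-up "calculated" density values
obs = 3.0 * calc + 0.5       # identical shape, different scale and offset

print("R  (same map):        ", real_space_r(calc, calc))
print("R  (scaled + offset): ", real_space_r(obs, calc))
print("CC (scaled + offset): ", map_cc(obs, calc))
```

The two maps have exactly the same shape, so the CC stays at 1, while the R factor moves well away from zero purely because of the scale and offset.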

Oh, and if you are getting zero or negative CC for perfectly good models, you might want to check and be sure that SFALL is doing the map calculation properly. A while ago I noticed that if I were missing the CRYST1 line in the PDB file, then SFALL would happily give me a random map, even if I gave it the cell and SG in the input cards! This was probably fixed in the latest release, but I have not checked...
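A quick pre-flight check for the missing CRYST1 problem could look like the sketch below. `has_cryst1` is a hypothetical helper, not part of CCP4 or SFALL; it only tests whether any line of the PDB text starts with the CRYST1 record name and does not validate the cell or space group.

```python
def has_cryst1(pdb_text):
    """Return True if any line of the PDB text starts with the CRYST1 record name."""
    return any(line.startswith("CRYST1") for line in pdb_text.splitlines())

# Made-up PDB fragments for illustration
with_cell = (
    "CRYST1   50.000   60.000   70.000  90.00  90.00  90.00 P 21 21 21    4\n"
    "ATOM      1  CA  ALA A   1      10.000  10.000  10.000  1.00 20.00           C\n"
)
without_cell = (
    "ATOM      1  CA  ALA A   1      10.000  10.000  10.000  1.00 20.00           C\n"
)

for name, text in [("with_cell", with_cell), ("without_cell", without_cell)]:
    status = "ok" if has_cryst1(text) else "no CRYST1 record, check before running SFALL"
    print(name, "->", status)
```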

-James Holton
MAD Scientist


Saturday, 22 October 2011
