Thursday 20 October 2011

Calculate real-space R-factor/corr coeff for ligand

From: Brigitte Ziervogel
Date: 3 October 2011 20:32

Hi,

I am trying to calculate real-space R-factors and correlation coefficients for an array of different ligand conformations to find out which fits best in experimental density.  So far, I have been trying to use Overlapmap in CCP4 6.1.2 to do this, by correlating maps by residue and selecting the option to list real-space R-factors.  I would like to compare a map with the ligand omitted to maps calculated with each ligand conformer.

I am supplying Overlapmap with a refmac mtz file calculated without ligand in the model for map 1 and a pdb file that contains both protein and ligand coordinates to calculate map 2.  However, I'm confused about the output.  For the protein, which I know is well-defined and modeled correctly in the density, I see mostly reasonable correlation coefficients, ~0.9, but the real-space R-factor values are all over the place and range from zero to hundreds.  For example, for one residue the correlation coefficient is 0.8309 with an R-factor of 210.333.  I am very confused about how to interpret these values.  Has anyone else tried to use Overlapmap for a similar purpose and could give suggestions as to what I'm doing wrong?  Thanks!

Brigitte

----------
From: Adam Ralph


Dear Brigitte,


     Looking at the formulae, it is possible to get those results. Take the example below:


    Rho_cal = -0.11, 0.0, 0.05, 0.05
    Rho_obs = -0.08, 0.01, 0.04, 0.04


    R-fac = 0.02/0.0   =   undefined


    Correl =      0.0032 - (-0.0025*0.0025)
                   --------------------------------------     = 0.99
                            sqrt(0.0043 * 0.0024)




Did the calculations quickly so hope they are OK. However, I designed the data so 
that the denominator in the R-fac is zero i.e. the sum of Rho_cal = - sum of Rho_obs. 
It would imply that the ATMMAP from sfall does not cover the correct set of grid points
for the ligand. You expect the Fc map to be positive in this region. You need to generate 
a new ATMMAP for each different ligand conformation.
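
The numbers in this example can be checked with a few lines of Python. This is only a sketch: it assumes the R-factor is formed from signed sums over the grid points (which is what the 0.02/0.0 result implies), and the function names are made up for illustration rather than taken from OVERLAPMAP.

```python
import math

# Density values from the example above
rho_cal = [-0.11, 0.0, 0.05, 0.05]
rho_obs = [-0.08, 0.01, 0.04, 0.04]

def signed_sums(obs, cal):
    """Sum of (obs - cal) and sum of (obs + cal), with no absolute values."""
    num = sum(o - c for o, c in zip(obs, cal))
    den = sum(o + c for o, c in zip(obs, cal))
    return num, den

def correlation(obs, cal):
    """Correlation coefficient: (<xy> - <x><y>) / sqrt(var_x * var_y)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_c = sum(cal) / n
    cov = sum(o * c for o, c in zip(obs, cal)) / n - mean_o * mean_c
    var_o = sum(o * o for o in obs) / n - mean_o ** 2
    var_c = sum(c * c for c in cal) / n - mean_c ** 2
    return cov / math.sqrt(var_o * var_c)

num, den = signed_sums(rho_obs, rho_cal)
cc = correlation(rho_obs, rho_cal)

# The denominator is zero to within rounding error, so num / den is meaningless,
# while the correlation coefficient is still very close to 1.
print(f"numerator = {num:.4f}, |denominator| = {abs(den):.4f}")
print(f"correlation = {cc:.3f}")
```

In floating point the denominator may come out as a tiny non-zero number rather than exactly zero, in which case the ratio would be enormous instead of undefined.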


Adam 

----------
From: Ian Tickle



Hi Adam

That doesn't look right to me. The formula according to Jones et al. is:

                     RSR = sum(| rho_obs - rho_calc |) / sum(| rho_obs + rho_calc |)

So for your example we have RSR = (.03 + .01 + .01 + .01) / (.19 + .01 + .09 + .09) = .13 which is obviously quite a reasonable number.

If you want some numbers which will cause a zero divide you have to make rho_obs = - rho_calc for every point so each term in the sum in the denominator above is zero, and therefore obviously the denominator itself would be zero.

Here are the relevant code snippets from OVERLAPMAP:
 
                     iave(j,i)=0
                     xave(j,i)=0.
                     yave(j,i)=0.

                     iave(jj,ii)=iave(jj,ii)+1
                     xave(jj,ii)=xave(jj,ii)+xwork
                     yave(jj,ii)=yave(jj,ii)+ywork
 
                     xave(jj,ii)=xave(jj,ii)/iave(jj,ii)
                     yave(jj,ii)=yave(jj,ii)/iave(jj,ii)

                     rfac(jj,ii) =  (abs(xave(jj,ii)- yave(jj,ii))) / (abs(xave(jj,ii)+ yave(jj,ii)))

This looks wrong to me since the absolute value is being taken after the summation instead of before, i.e. it should be forming sums of abs(xwork-ywork) and abs(xwork+ywork).  The absolute value of a sum is not the same as the sum of absolute values!  Note that the division throughout by the number of points (iave(jj,ii)) has no effect on the result, since it cancels between numerator and denominator.
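
The difference between the two orderings can be illustrated with a short Python sketch. It is not a port of OVERLAPMAP: the function names and the density values are invented for illustration, and the first function only mirrors the logic of the Fortran lines quoted above (average first, then take the absolute value).

```python
import math

def rsr_abs_of_sums(x, y):
    """Absolute value taken AFTER averaging, as in the quoted snippet."""
    n = len(x)
    xave = sum(x) / n
    yave = sum(y) / n
    den = abs(xave + yave)
    if math.isclose(den, 0.0, abs_tol=1e-12):
        return math.inf  # averages cancel: the ratio is undefined
    return abs(xave - yave) / den

def rsr_sum_of_abs(x, y):
    """Absolute value taken BEFORE summing, point by point."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(abs(a + b) for a, b in zip(x, y))
    return num / den if den > 0.0 else math.inf

# Invented values: two maps that agree well, with positive and negative
# density nearly cancelling over the residue.
x = [0.50, -0.49]
y = [0.40, -0.409]

buggy = rsr_abs_of_sums(x, y)
fixed = rsr_sum_of_abs(x, y)
print(f"abs of sums: {buggy:.2f}")
print(f"sum of abs:  {fixed:.2f}")
```

With these values the first form gives a ratio well above 1 even though the maps agree closely, whereas the second stays near 0.1. Dividing both sums by the number of points would not change either result.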

I didn't check the formula for the correlation coefficient.

But your broad conclusion (that the data is garbage) is very probably correct!

Cheers

-- Ian

----------
From: Ian Tickle



Ooops (.03+.01+.01+.01)/(.19+.01+.09+.09) = .16
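
The corrected value can be reproduced with a minimal Python sketch of the formula quoted above from Jones et al. The function name `rsr` is made up for illustration; the density values are the ones from Adam's example.

```python
import math

def rsr(rho_obs, rho_calc):
    """RSR = sum(|rho_obs - rho_calc|) / sum(|rho_obs + rho_calc|)."""
    num = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    den = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    # The denominator vanishes only if rho_obs == -rho_calc at every point.
    return num / den if den > 0.0 else math.inf

rho_calc = [-0.11, 0.0, 0.05, 0.05]
rho_obs = [-0.08, 0.01, 0.04, 0.04]

value = rsr(rho_obs, rho_calc)
print(f"{value:.2f}")  # prints 0.16, i.e. 0.06 / 0.38

# A zero divide needs every term to cancel individually:
print(rsr([0.1, -0.2], [-0.1, 0.2]))  # prints inf
```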

-- Ian

----------
From: Ian Tickle



On Tue, Oct 4, 2011 at 7:14 PM, <bkziervogel@uchicago.edu> wrote:
Hi Adam and Ian,

Thanks for your help.  If I re-calculate the R-factors with the correct absolute values I get more reasonable values.  However, I'm still a bit confused because the output given by the Overlapmap program is structure factor values, which are used to calculate the real-space R-factors.  Should this not be Rho values instead?  Additionally, a lot of my structure factors, even for the protein, which I know fits well within the experimental density, are 0 for the sidechains or negative.  Any idea what's going on here?  I've attached some sample data from the Overlapmap output file.  Thanks in advance.

Brigitte




Hi Brigitte

Yes I agree with you that the output is very confusing!  I don't know exactly what 'Fobs' & 'Fcalc' are but I'm pretty sure they can't be structure factors.  I would guess that 'F' is actually rho (i.e. rho calculated by FFT from F).  According to the man page overlapmap only inputs & outputs maps: nowhere does it mention reading or writing SFs.  Very confusing! 

Also why the values are small or negative, I've no idea as I've never used overlapmap.  I would keep posting to the BB in the hope that someone can solve your problem.

Cheers

-- Ian

----------
From: Brigitte Ziervogel


Thanks Ian, I'll keep posting :)

Brigitte

