Saturday 29 October 2011

Ice rings... [maps and missing reflections]

From: Ed Pozharski
Date: 11 October 2011 18:34


On Tue, 2011-10-11 at 15:24 +0000, Bruno KLAHOLZ wrote:
> However, once you have determined and refined your structure it may be
> worth predicting the intensity of these spots and putting them back for
> map calculation,

REFMAC does this by default, because

"expected values of unknown structure factors for missing reflections are
better approximated using DFc than with 0 values."

CNS defaults to excluding them.  As for Phenix, I am not entirely sure -
it seems that phenix.refine also excludes them by default
(fill_missing_f_obs=False), but if you use the GUI then the fill-in
option is turned on.



--
Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
                                               Julian, King of Lemurs

----------
From: Nat Echols


In practice, it will be turned on for command-line phenix.refine too if you don't supply your own custom map definitions. It actually produces both "filled" and "unfilled" maps, but the former is what most users will see in Coot.

-Nat

----------
From: Pavel Afonine



Filling with DFc is better, but not always. What about, say, an 80% complete dataset? Filling in the missing 20% with Fcalc (or DFcalc, or bin-averaged <Fobs>, or anything else - it doesn't matter, since the phases will dominate anyway) will strongly bias the map towards the model. Equally clearly, there are cases where filling in a few missing reflections significantly improves map interpretability without introducing any bias.

phenix.refine always outputs two 2mFo-DFc maps: one is computed using the original set of Fobs, and the other using a set of Fobs in which the missing reflections are filled in with DFc calculated from well-determined atoms only. By default, Coot will open the "filled" one.
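
To make the recipe concrete, here is a minimal numpy sketch of how such "filled" 2mFo-DFc coefficients can be assembled (illustrative code, not what phenix.refine actually does internally; the m, D and Fc arrays are assumed to come from the refinement program):

    import numpy as np

    def filled_2mfo_dfc(f_obs, m, d, f_calc):
        """2mFo-DFc map coefficients with missing Fobs (NaN) replaced by DFc.

        f_obs  : real amplitudes, NaN where a reflection is missing
        m      : figures of merit
        d      : D (sigmaA times a scale factor)
        f_calc : complex model structure factors (these supply the phase)
        """
        phase = f_calc / np.abs(f_calc)               # exp(i*phi_calc)
        amp = 2.0 * m * f_obs - d * np.abs(f_calc)    # NaN wherever Fobs is missing
        amp = np.where(np.isnan(f_obs), d * np.abs(f_calc), amp)
        return amp * phase                            # Fourier coefficients of the map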

Pavel


----------
From: Ed Pozharski


DFc, if properly calculated, is the maximum likelihood estimate of the
observed amplitude.  I'd say that 0 is by far the worst possible
estimate, as Fobs are really never exactly zero.  I am not sure what the
situation would be in which it's better to use Fo=0 - perhaps if the
model is grossly incorrect?  But in that case completeness may be the
least of my worries.

Indeed, phases drive most of the model bias, not amplitudes.  If the
model is good and the phases are good, then DFc will be a much better
estimate than zero.  If the model is bad and the phases are bad, then
filling in missing reflections will not increase the bias too much.  But
replacing them with zeros will introduce extra noise; in particular, ice
rings may mess things up and cause ripples.
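
The ripple effect is easy to reproduce in one dimension (a toy numpy sketch with made-up "structure factors", not real diffraction data): zeroing a band of coefficients distorts the back-transformed density everywhere, while filling the gap with even a rough estimate shrinks the error proportionally.

    import numpy as np

    rng = np.random.default_rng(1)
    f_true = rng.normal(size=128) + 1j * rng.normal(size=128)  # toy "structure factors"
    rho_true = np.fft.ifft(f_true)                             # the "true" map

    gap = slice(30, 40)                 # a band of reflections lost to an "ice ring"
    f_zero = f_true.copy(); f_zero[gap] = 0.0                # fill with zeros
    f_fill = f_true.copy(); f_fill[gap] = 0.8 * f_true[gap]  # imperfect but nonzero estimate

    for label, f in (("zeros", f_zero), ("80% estimate", f_fill)):
        err = np.abs(np.fft.ifft(f) - rho_true).max()
        print(f"{label:>14}: max map error {err:.4f}")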

On the practical side, one can always compare the maps calculated with
and without the missing reflections filled in.

--
After much deep and profound brain things inside my head,
I have decided to thank you for bringing peace to our home.
                                   Julian, King of Lemurs

----------
From: Pavel Afonine


Hi Ed,

Yes, that's all true about what DFc is. In terms of filling in missing Fobs it's not too important (as far as map appearance is concerned) which values you take - DFc, <Fobs>, etc. I spent a few days playing with this some years ago.
Yep, that was the point - sometimes it is a good thing to do, and sometimes it is not, and ...
... this is why phenix.refine outputs both maps :-)

All the best,
Pavel


----------
From: Ed Pozharski

Do you have a real-life example of Fobs=0 being better?  You make it
sound as if it's a 50/50 situation.

--
"Hurry up before we all come back to our senses!"
                          Julian, King of Lemurs

----------
From: Pavel Afonine




No (sorry if what I wrote sounded that misleading).
Hopefully, there will be a paper some time soon discussing all this - we are working on it right now.

Pavel


----------
From: Randy Read


If the model is really bad and sigmaA is estimated properly, then sigmaA will be close to zero so that D (sigmaA times a scale factor) will be close to zero.  So in the limit of a completely useless model, the two methods of map calculation converge.
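
For acentric reflections the figure of merit is m = I1(X)/I0(X) with X = 2*sigmaA*|Eo||Ec|/(1 - sigmaA^2), so m and D vanish together as sigmaA goes to zero. A quick numerical check of the limit (a sketch that takes D as sigmaA itself, i.e. ignores the scale factor, and sets |Eo| = |Ec| = 1):

    from scipy.special import i0, i1

    def acentric_m(sigma_a, e_obs=1.0, e_calc=1.0):
        # figure of merit m = I1(X)/I0(X), X = 2*sigmaA*Eo*Ec/(1 - sigmaA^2)
        x = 2.0 * sigma_a * e_obs * e_calc / (1.0 - sigma_a**2)
        return i1(x) / i0(x)

    for sa in (0.9, 0.5, 0.1, 0.01):
        # with Fo = Fc = 1, the 2mFo-DFc coefficient scales as 2m - D
        print(f"sigmaA={sa:5.2f}  m={acentric_m(sa):.3f}  2m-D={2*acentric_m(sa)-sa:.3f}")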

Regards,

Randy Read
------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research     
Wellcome Trust/MRC Building                  
Hills Road                                   
Cambridge CB2 0XY, U.K.                       www-structmed.cimr.cam.ac.uk

----------
From: Garib N Murshudov

In the limit, yes. However, the limit is when we do not have a solution, i.e. when the model errors are very large; in that limit the map coefficients will be 0 even for 2mFo-DFc maps. In refinement we have some model, and at the moment we have a choice between 0 and DFc. 0 is not the best estimate, as Ed rightly points out. We replace (I am sorry for the self-promotion, nevertheless: Murshudov et al, 1997) "absent" reflections with DFc, but it introduces bias, and the bias becomes stronger as the number of "absent" reflections becomes larger.

We need a better way of estimating "unobserved" reflections. In statistics there are a few approaches; none of them is foolproof, and all of them are computationally expensive. One technique is called multiple imputation; it may give better refinement behaviour and a less biased map. Another is integration over all errors of the model as well as the experimental data (too many parameters for numerical integration, and there is no closed-form formula); this would give a less biased map with a more pronounced signal.

Regards
Garib

Garib N Murshudov 
Structural Studies Division
MRC Laboratory of Molecular Biology
Hills Road 
Cambridge 
CB2 0QH UK





----------
From: Ethan Merritt

I don't quite follow how one would generate multiple imputations in this case.

Would this be equivalent to generating a map from (Nobs - N) refls, then
filling in F_estimate for those N refls by back-transforming the map?
Sort of like phase extension, except generating new Fs rather than new phases?

       Ethan
--
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

----------
From: Dale Tronrud


  Unless you do some density modification you'll just get back zeros for
the reflections you didn't enter.

Dale

----------
From: Garib N Murshudov

The best way would be to generate them from the probability distributions derived after refinement, but that has the problem that you need to integrate over all errors. Another, simpler way would be to generate them from the Wilson distribution multiple times, do the refinement multiple times, and average the results. I have not done any tests, but on paper it looks like a sensible procedure.
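
A minimal sketch of the simpler variant (assumptions: acentric reflections only, Wilson Sigma estimated per resolution bin from the observed amplitudes, in which case the acentric Wilson amplitude distribution is a Rayleigh distribution with scale sqrt(Sigma/2)):

    import numpy as np

    def impute_wilson(f_obs, bins, n_imputations=10, seed=0):
        """Draw missing amplitudes (NaN in f_obs) from the Wilson distribution.

        f_obs : amplitudes, NaN for missing reflections
        bins  : integer resolution-bin index for each reflection
        Returns an (n_imputations, n_refl) array of completed data sets.
        """
        rng = np.random.default_rng(seed)
        missing = np.isnan(f_obs)
        sets = np.tile(f_obs, (n_imputations, 1))
        for b in np.unique(bins):
            sel = (bins == b) & missing
            obs = f_obs[(bins == b) & ~missing]
            if obs.size == 0 or not sel.any():
                continue
            sigma = np.mean(obs**2)                   # Wilson Sigma for this bin
            sets[:, sel] = rng.rayleigh(np.sqrt(sigma / 2.0),
                                        size=(n_imputations, sel.sum()))
        return sets  # refine against each completed set, then average the results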

regards
Garib

----------
From: Ethan Merritt

Dale Tronrud wrote:
Sure.  And different DM procedures would give you different imputations,
or at least that was my vague idea.

Garib N Murshudov wrote:
OK.  That makes sense.

               Ethan

----------
From: Eleanor Dodson


Here we are, I presume, only worried about strong reflections lost behind an ice ring. At least, that is where the discussion began.

Isn't the best approach to this problem to use integration software which attempts to give a measurement, albeit with a high error estimate?

The discussion has strayed into what to do with incomplete data sets. In these cases there might be something to learn from the Free Lunch ideas used in ACORN, SHELX and other programs - set the missing reflections to E=1, and normalise them properly to an appropriate amplitude.
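
In amplitude terms, setting E = 1 just means giving each missing reflection the r.m.s. amplitude of its resolution shell. A rough sketch of that normalisation step (shell indices are assumed to be assigned already):

    import numpy as np

    def fill_e_equals_one(f_obs, shell):
        """Replace missing amplitudes (NaN) with the shell r.m.s. amplitude, i.e. E = 1."""
        f = f_obs.copy()
        for s in np.unique(shell):
            obs = f_obs[(shell == s) & ~np.isnan(f_obs)]
            if obs.size:
                f[(shell == s) & np.isnan(f_obs)] = np.sqrt(np.mean(obs**2))
        return f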

Eleanor

----------
From: Tim Gruene


Some people call this the "free-lunch-algorithm" ;-)
Tim

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen




----------
From: Edward A. Berry


Doesn't work - the Fourier transform is invertible. As someone already said in this
thread, if the map was made with coefficients of zero for certain reflections
(which is equivalent to omitting those reflections), the back-transform will
give zero for those reflections, unless you do some density modification first.
So free lunch is a good name - there ain't no such thing!
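
The point is easy to verify numerically (a toy 1D sketch):

    import numpy as np

    rng = np.random.default_rng(2)
    f = rng.normal(size=64) + 1j * rng.normal(size=64)
    f[10:20] = 0.0                      # "missing" reflections entered as zero

    rho = np.fft.ifft(f)                # map from the incomplete data
    print(np.allclose(np.fft.fft(rho)[10:20], 0.0))     # True: the zeros come straight back

    rho_mod = np.clip(rho.real, 0.0, None)               # crude density modification
    print(np.allclose(np.fft.fft(rho_mod)[10:20], 0.0))  # False: new estimates appear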

----------
From: Ethan Merritt

Tim refers to the procedure described in
 Sheldrick, G. M. (2002). Z. Kristallogr. 217, 644-650

which was later incorporated into shelxe as the Free Lunch Algorithm.
It does indeed involve a form of density modification.
Tim is also correct that this procedure is the precedent I had in mind,
although I had forgotten its clever name.

       cheers,

----------
From: George M. Sheldrick


Dear Ethan,

Thank you for the reference, but actually it's the wrong paper, and anyway
my only contribution to the 'free lunch algorithm' was to name it (in the
title of the paper by Uson et al., Acta Cryst. (2007) D63, 1069-1074). By
that time the method was already being used in ACORN and by the Bari group,
who were the first to describe it in print (Caliandro et al., Acta Cryst.
(2005) D61, 556-565). As you correctly say, it only makes sense in the
context of density modification, but under favorable conditions, i.e.
native data to 2 Å or better, inventing data to a resolution that you
would have liked to collect but didn't can make a dramatic improvement to
a map, as SHELXE has often demonstrated. Hence the name. And of course
there is no such thing as a free lunch!

Best regards, George
--
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany


----------
From: Tim Gruene


I am glad the structures that have been solved using the
free-lunch-algorithm as implemented in shelxe did not know they were not
allowed to be solved. Of course there is DM involved, as has been
pointed out ;-)

----------
From: James Holton


Indeed we do! Because this appears to be the sum total of how the correctness of the structure is judged. It is easy to forget, I think, that from the "point of view" of the refinement program, all reflections flagged as belonging to the "free" set are, in effect, "missing". So Rfree is really just a score for how well DFc agrees with Fobs?
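
Read literally, that score might look like the sketch below (hypothetical code: the conventional free R factor compares Fobs with scaled |Fc|, so using D|Fc| as the model's prediction for the "free"/"missing" reflections is just this framing made explicit):

    import numpy as np

    def r_free_as_dfc_score(f_obs, d, f_calc, free_flag):
        """R over the free set, with D|Fc| as the prediction for each held-out Fobs."""
        fo = f_obs[free_flag]
        pred = (d * np.abs(f_calc))[free_flag]
        return np.sum(np.abs(fo - pred)) / np.sum(fo)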

-James Holton
MAD Scientist

