Tuesday, 10 April 2012

Disorder or poor phases?

From: Francis E Reyes
Date: 10 April 2012 16:22

Hi all,

Assume that the diffraction resolution is low (say 3.0 A or worse) and that a model (a high-resolution homologue from 2 A X-ray data) was docked into experimental phases (say 4 A or worse) and extended to the 3.0 A data using refinement (with the high-resolution model as a source of restraints). There are some conformational differences between the high-resolution model and the target crystal.

The author observes that in the 2fofc map at 3A, most of the model shows reasonable density, but for a stretch of backbone the density is weak.

Is the weakness of the density in this region because of disorder or bad model phases?


Would love people's thoughts on this one,

F


---------------------------------------------
Francis E. Reyes M.Sc.

----------
From: Tim Gruene 


Dear Francis,

the phases calculated from the model affect the whole unit cell, hence it
is more likely this is real(-space, local) disorder rather than poor
phases.

Regards,
Tim

P.S.: The author should not look at a 2fofc-map but a
sigma-A-weighted map to reduce model bias.
--
Dr Tim Gruene

----------
From: Dale Tronrud
  The phases do have effects all over the unit cell but that does not
prevent them from constructively and destructively interfering with one
another in particular locations.  Some years ago I refined a model of
the bacteriochlorophyll-containing protein to a 1.9 A data set when the
sequence of that protein was unknown.  This is primarily a beta sheet
protein and a number of the loops between the strands were disordered.
Later the amino acid sequence was determined and I finished the refinement
after building in these corrections.  The same data set was used, but
a number of the loops had become "ordered".  While the earlier model
(3BCL) had 357 amino acids the final model (4BCL) had 366.

  These nine amino acids didn't become ordered over the intervening
years.  They were just as ordered when I was building w/o a sequence,
it is just that I couldn't see how to build them based on the map's
appearance.

  One possibility is that the density for these residues was weak
and the noise (that was uniform over the entire map) obliterated their
signal where it only obscured the stronger density.  Another possibility
is that the better model had a better match of the low resolution F's
and less intense ripples radiating from the surface of the molecule,
resulting in things "sticking out" being less affected.

  Whatever the details, the density for these amino acids was too
weak to model with the poorer model phases and became buildable with
better phases.  The fact that they could not be seen in the early map
was not an indication that they were "disordered".

  The first six amino acids of this protein have never been seen in
any map, including the 1.3 A resolution model 3EOJ (which by all rights
should have been called 5BCL ;-) ).  These residues appear to be truly
disordered.  Going back to 3BCL - The map for this model is missing
density for a number of residues of which we know some are disordered
and some simply unmodelable because of the low quality of the phases.
I don't know of a way, looking at that map alone, of deciding which
is which.  Because of this observation I don't believe it is supportable
to say "I don't see density for these atoms therefore they must be
disordered."  Additional evidence is required.

Dale Tronrud


----------
From: Francis E Reyes
Dale

Thank you for the case study. I will certainly remember it when I next see:

> "I don't see density for these atoms therefore they must be
> disordered."

You do mention though, that when you were able to assign the sequence to the beta sheets, that the loop regions became clear.

I am considering the case (which the majority of cases seem to be) where the author has built and sequence-assigned 95% of the ASU but is unable to model a loop region. One possibility is that the loop is truly disordered (95% of the ASU is built and is presumably right); the other is that there is an inherent error in the existing structure that is affecting the interpretation of the loop region. The errors are probably extremely subtle and distributed throughout the model (think of the improvements DEN refinement gave for the re-refinement of p97).

I guess in either case, because of the dependency of the map on the existing set of phases, it's difficult to determine whether it's truly disordered or not.


> P.S.: The author should not look at a 2fofc-map but a
> sigma-A-weighted map to reduce model bias.

Tim,

I was assuming a sigmaA-weighted 2Fo-Fc map (which I believe is the default for most crystallographic refinement packages).

----------
From: Gerard Bricogne
Dear Dale,

    There is perhaps a third factor in the progress you were able to make,
namely the improvement in the refinement programs. Your first results were
probably obtained with a least-squares-based program, while the more recent
would have come from maximum-likelihood-based ones. The difference lies in
the quality of the phase information produced from the model through
comparison of Fo and Fc, with much greater bias-correction capabilities in
the ML approach. Here, it removed the bias towards some regions being absent
in the model, and made them no longer be absent in the maps. So it is a
question of the quality of the phase information.


    With best wishes,

         Gerard.

--

----------
From: Dale Tronrud
Dear Gerard,

  No, the updated model (4BCL) was published in 1993 (although apparently
not deposited until 1998 - What was wrong with me?)  Both were refined
with that classic least-squares program TNT.  I hope there was some
improvement in the software between 1986 and 1993, and I always tried to
work with the most recent version, but there wasn't a switch in target
function.

  I agree that the distortions in these maps would have been less if
an ML approach had been used and perhaps the location of the "disordered"
residues would have been apparent earlier in the process.   Maybe this
sort of problem will not be seen again at 1.9 A resolution.  My goal was
simply to provide an example where errors due to model phases didn't
distribute evenly throughout the map but had greater consequence in some
locations.

Dale


----------
From: Kay Diederichs
Hi Dale,

my experience is that high-B regions may become "visible" in maps only late in refinement. So my answer to the original poster would be - "both global reciprocal-space (phase quality) and local real-space (high mobility) features contribute to a region not appearing ordered in the map". This would be supported by your experience if those residues that you could not model in 3BCL had high (or at least higher) B-factors compared to the rest of the model. Is that so?

best,

Kay

----------
From: Dale Tronrud
On 4/10/2012 10:44 PM, Kay Diederichs wrote:
> Hi Dale,
>
> my experience is that high-B regions may become "visible" in maps only late in refinement. So my answer to the original poster would be - "both global reciprocal-space (phase quality) and local real-space (high mobility) features contribute to a region not appearing ordered in the map". This would be supported by your experience if those residues that you could not model in 3BCL had high (or at least higher) B-factors compared to the rest of the model. Is that so?

  Actually the residues I couldn't model in 3BCL had no B's... :)

  Seriously, the residues that appeared for 4BCL did have B values much
higher than average.  Their density was weak in the best of circumstances
and more susceptible to obliteration by the distortions caused by
imprecision in the phases.  I don't really want to describe this as "phase
error" as that phrase conjures notions of large changes in phase.  The
R value only dropped from 18.9% to 17.8% from 3BCL to 4BCL.  I don't
expect there were huge differences in the phase angles, but the differences
were enough.

Dale
> best,
>
> Kay

----------
From: Jens Kaiser
Hello,
 Kay is absolutely right. Just to make this clear: We all know that in
many cases, you start out with poor phases (i.e. a weak SIR/MIR/MAD or a
borderline replacement model) and your density is "modest". The prudent
thing to do at this stage is to build only things you trust and have a
look at the improved density. We all also know that an improved
density means, in most cases, a density with improved phases.
 The term "disorder" means, a region of higher uncertainty. Logically,
the more information you have (more actual data points - i.e.
reflections == resolution/completeness; more reliable Fs; etc.; _better
phases_) the better you can pinpoint these areas.
 The phase is a quantity we cannot measure, but it affects the
density the most. We determine it through "refinement" (which
encompasses density interpretation and computational optimization of
atomic parameters with regard to the reflection data).
 Gedankenexperiment: If you collect data on a crystal, let's say on a
sealed tube from 1950 with a photon counter, and you collect the same
data from the same crystal on a modern synchrotron with a PAD, you might
find certain areas of your molecule "disordered" with the sealed-tube
data that you might be able to interpret with the (more, better) data
from the synchrotron. Probably more so: if you have the same amount of
data and poorer or better phases, you have a similar problem.
 My point being: the term "disorder" is related to the amount of data
you have (be it collected (I's) or deduced (phi's)). With very few
exceptions (see for example the paper for 1M1N), it's not the method
(diffraction) that tricks us, it's just the amount of information that
we have, that prevents us from building "complete" models. Most
importantly, the term "disordered" - as used in macromolecular
terminology - depends on resolution /and/ quality of the phases. (As a
side note: What we call "alternative conformations" in macromolecular
crystallography is called "disorder" in small molecule crystallography.
I don't know what the SM word for the MM "disorder" is...)

Cheers,

Jens

----------
From: Bernhard Rupp
I am not sure anyone looks at plain 2fo-fc maps anymore - it almost always (at least since the beginning of the 3rd millennium) implies 2mFo-DFc ML maps. Detailed explanations of the coefficients and their relation to ML sigma A are in R. Read papers, BMC, Bricogne, etc. pp.
BR,
--
-----------------------------------------------------------------
Bernhard Rupp
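
For readers unfamiliar with the coefficients mentioned here, the following is a minimal sketch, not taken from any of the posts, of how a single 2mFo-DFc map coefficient is commonly formed. It assumes the usual convention of 2m|Fo| - D|Fc| for acentric reflections and m|Fo| for centric ones, combined with the model phase. The function name `two_mfo_dfc` and all numbers are made up for illustration; in practice m (figure of merit) and D come out of the refinement program's sigma-A estimation.

```python
import cmath
import math


def two_mfo_dfc(f_obs, f_calc, phi_calc_deg, m, d, centric=False):
    """Return one sigma-A-weighted map coefficient as a complex number.

    f_obs, f_calc : observed and calculated amplitudes
    phi_calc_deg  : model phase in degrees
    m, d          : figure of merit and sigma-A-derived scale for this reflection
    """
    if centric:
        amplitude = m * f_obs                    # centric convention: mFo
    else:
        amplitude = 2.0 * m * f_obs - d * f_calc  # acentric convention: 2mFo - DFc
    # A negative amplitude simply flips the phase by 180 degrees.
    return amplitude * cmath.exp(1j * math.radians(phi_calc_deg))


# Hypothetical reflection: Fo = 100, Fc = 80, model phase 0 degrees.
weighted = two_mfo_dfc(100.0, 80.0, 0.0, m=0.9, d=0.95)
centric = two_mfo_dfc(100.0, 80.0, 0.0, m=0.9, d=0.95, centric=True)
plain = two_mfo_dfc(100.0, 80.0, 0.0, m=1.0, d=1.0)  # reduces to plain 2Fo - Fc
print(weighted, centric, plain)
```

With m = D = 1 the expression collapses to the unweighted 2Fo-Fc coefficient, which is one way to see why the two names are often used interchangeably in conversation.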


----------
From: James Holton
Francis,

I think in the cases you describe the region in question is disordered.  Time and time again I have users coming to my beamline wanting to clean up a "questionable region" by getting experimental phases.  Ahh!  If only I had a nickel for each one.  Oh wait, I suppose I kind of do?  I take that back!  Go MAD everyone!

Much as I hate to discourage people from using my favorite technique, Tim is right: phases are not region-specific in electron density maps.  Dale does make a good point that there is such a thing as "model bias" and one can argue that experimental phases don't have it.  But, this is only true if you have not yet applied solvent flattening.  How long has it been since you looked at a "raw" experimentally-phased map (before solvent flattening)?  I'm willing to bet a while.  With very few exceptions, raw experimental phases are lousy.  We have actually become quite dependent on density modification to clean them up.  In fact, solvent flattening is the only reason why SAD works at all.

However, you CAN use anomalous differences to clear up disordered regions in a different way.  Something I started calling "SeMet scanning" a number of years ago.  A few of my users have done this, and a good example of it is Figure 3 of Huang et al. 2004 (doi:10.1038/nsmb826).  Basically, you mutate residues in the disordered region one at a time to SeMet, and look at phased anomalous difference Fourier (PADF) maps.  These maps are surprisingly clear, even when the anomalous difference signal is so weak as to make experimental phasing hopeless.  Yes, the best phases to use for PADF maps are model phases, but, as always, it is prudent to refine the model after omitting the thing you are looking for before calculating such phases.
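
As a rough illustration of what goes into such a map (a sketch added for clarity, not code from the post or from any particular program): a phased anomalous difference Fourier is commonly computed from the Bijvoet difference |F+| - |F-| paired with the model phase retarded by 90 degrees. The function name `padf_coefficient` and the numbers are hypothetical, and sign conventions should be checked against whatever software is actually used.

```python
import cmath
import math


def padf_coefficient(f_plus, f_minus, phi_model_deg, fom=1.0):
    """One coefficient of a phased anomalous difference Fourier.

    Assumes the common convention: amplitude = F(+) - F(-),
    phase = model phase - 90 degrees, optionally weighted by a figure of merit.
    """
    delta_ano = f_plus - f_minus
    phase = math.radians(phi_model_deg - 90.0)
    return fom * delta_ano * cmath.exp(1j * phase)


# Hypothetical reflections with a 5% Bijvoet difference.
c1 = padf_coefficient(105.0, 100.0, 90.0)   # model phase 90 deg -> map phase 0 deg
c2 = padf_coefficient(100.0, 105.0, 0.0)    # negative difference flips the phase
print(c1, c2)
```

The model phases here should come from a model refined with the residue of interest omitted, as the paragraph above advises.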

Another way to get residue-specific labeling for low-resolution chain tracing is radiation damage.  If you expose for the right amount of time, Asp and Glu side chains will be specifically "burnt off", but not Asn and Gln.  You will also see Met losing its head, etc.  So, as long as you have read Burmeister (2000), an Fo-Fo map of damaged vs undamaged can be used to guide sequence assignment, even at 4.5 A and worse.

Anyway, when it comes to the question of "is it disordered or is it model bias?", I think it is usually the former.  It is very difficult to make "model bias" suppress a region that is actually well-ordered.  Try it!  After all, this is the whole reason why we bother looking at fo-fc maps.  Then again, it is always possible to have a model so bad that the phase error is enough to squash anything.  An excellent example of this can be found in the Book of Fourier.  Taking amplitudes from the image of a cat, you can see what happens when you use the phases of a duck:
http://www.ysbl.york.ac.uk/~cowtan/fourier/picduckcatfft.gif
as opposed to what happens if you use the phases of a manx:
http://www.ysbl.york.ac.uk/~cowtan/fourier/piccatmanx2.gif
A manx is a breed of cat that doesn't have a tail, so no animals were harmed in obtaining these phases.  My point here is that the cat's tail can be seen quite readily in the 2fo-fc map if most of the structure is already "right", but if your model is completely unrelated to the true structure (fitting a duck into a cat-shaped hole), then everything is "in the noise".
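
The cat/manx/duck comparison can be reproduced in one dimension with a few lines of code. This is a toy sketch under stated assumptions, not the calculation behind the linked images: the "structures" are invented point atoms on a 32-point grid, the "manx" is the cat without its tail atom, and the "duck" is an unrelated set of peaks (made mirror-symmetric only to keep the toy small). Each map combines the cat's amplitudes with the phases of one of the three models, i.e. an Fo-type map with model phases.

```python
import cmath
import math

N = 32  # grid points along a 1-D "unit cell"


def density(atoms):
    """Place point atoms, given as (position, weight) pairs, on the grid."""
    rho = [0.0] * N
    for position, weight in atoms:
        rho[position % N] += weight
    return rho


def structure_factors(rho):
    """Plain discrete Fourier transform of the density."""
    return [sum(r * cmath.exp(-2j * math.pi * k * x / N) for x, r in enumerate(rho))
            for k in range(N)]


def hybrid_map(amplitude_source, phase_source):
    """Map built from the amplitudes of one density and the phases of another."""
    f_amp = structure_factors(amplitude_source)
    f_phi = structure_factors(phase_source)
    coeffs = [abs(a) * cmath.exp(1j * cmath.phase(p)) for a, p in zip(f_amp, f_phi)]
    return [sum(c * cmath.exp(2j * math.pi * k * x / N)
                for k, c in enumerate(coeffs)).real / N
            for x in range(N)]


def correlation(a, b):
    """Pearson correlation coefficient between two maps."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den


body = [(3, 6.0), (7, 8.0), (12, 5.0)]   # invented "body" atoms
tail = [(22, 3.0)]                        # the weak feature of interest
cat = density(body + tail)
manx = density(body)                      # mostly right model, tail missing
duck = density([(5, 7.0), (27, 7.0), (14, 4.0), (18, 4.0)])  # unrelated model

cc_cat = correlation(hybrid_map(cat, cat), cat)    # true phases: exact recovery
cc_manx = correlation(hybrid_map(cat, manx), cat)  # good partial model
cc_duck = correlation(hybrid_map(cat, duck), cat)  # wrong model
print("map correlation with the true cat:", cc_cat, cc_manx, cc_duck)
print("density at the tail position with manx phases:", hybrid_map(cat, manx)[22])
```

In this toy the manx-phased map stays close to the true density while the duck-phased map does not, which is the one-dimensional analogue of the two pictures. How strongly the tail itself comes back depends on the details of the example, so the printed tail value is best read as an illustration, not as a general rule.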

Real structures are usually somewhere between these two extremes, and I think an important shortcoming in modern crystallography is that we don't have a good quantitative description of this middle-ground.  We all like to think we know what "model bias" is, but we don't exactly have "units" for it.  Should we be using a scale of 0 to 1?  Or perhaps "duck" to "cat"?  Yes, I know we have "figure of merit", but FOM is not region-specific.

 In my experience, as long as you have ~50% of the electrons in the "right" place (and none of them in the "wrong" place), then you can generally trust that the biggest difference feature in the fo-fc map is "real", and build from there.  As the model becomes more complete, the phases should continue to get better, not worse.  Eventually, this does break down, although I'm not really sure why.  With small molecules, the maximum fo-fc peak keeps getting bigger (on a sigma scale) as you add more and more atoms, and the biggest one you will ever see is the last one.  For macromolecules, the difference features keep getting smaller and smaller as you build.  Perhaps small errors (like non-Gaussian atomic displacement distributions being modeled as Gaussians) slowly accumulate?  Perhaps there are other sources of systematic error that we don't yet fully understand?  Eventually, for whatever reason, you stop building.  Having electrons in the wrong place is about twice as bad as not having them at all, which I think is why we trim models so aggressively for molecular replacement, and also why we are so reticent to model in things that we are not "sure" about.  Disordered regions, of course, will always lie at a lower level of electrons/A^3 than ordered regions, and therefore will be the last things to show up as the "top peak" in the fo-fc map.  They will also be the last things to poke their heads above the 1-sigma contour in a 2fo-fc map, but that does not mean they are "not there".  You can lower the map contour and see them easily enough (even in 3bcl).  The trick is having some kind of statistically-sound rule for "being sure".  Otherwise, we might start seeing "map contour creep", just as we currently see "R factor creep" in high-profile journals.
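
The point about disordered regions sitting at a lower density level can be made concrete with a small sketch (added here for illustration, with invented numbers): two "atoms" with the same number of electrons, one sharp and one smeared out as a high-B atom would be, placed in a 1-D cell and put on the usual sigma scale. The names `gaussian_atom` and `sigma_scale` are hypothetical helpers, and the widths are arbitrary and not derived from real B-factors.

```python
import math

N = 64  # grid points along a 1-D "unit cell"


def gaussian_atom(center, electrons, width):
    """Gaussian density with a fixed electron count; a larger width means more smearing."""
    peak = electrons / (width * math.sqrt(2.0 * math.pi))
    rho = []
    for x in range(N):
        dist = min(abs(x - center), N - abs(x - center))  # periodic distance
        rho.append(peak * math.exp(-0.5 * (dist / width) ** 2))
    return rho


def sigma_scale(rho):
    """Express a map as deviations from its mean in units of its r.m.s. deviation."""
    mean = sum(rho) / len(rho)
    rms = math.sqrt(sum((r - mean) ** 2 for r in rho) / len(rho))
    return [(r - mean) / rms for r in rho]


ordered = gaussian_atom(16, electrons=8.0, width=1.0)  # well-ordered atom
mobile = gaussian_atom(48, electrons=8.0, width=4.0)   # same electrons, smeared out
scaled = sigma_scale([a + b for a, b in zip(ordered, mobile)])

print("peak height in sigma, ordered atom:", scaled[16])
print("peak height in sigma, mobile atom: ", scaled[48])
```

Both atoms are fully "there" in terms of electrons, yet the smeared one peaks at a much lower sigma level, so whether it appears at a given contour depends on where the contour is set.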

-James Holton
MAD Scientist


----------
From: Gloria Borgstahl
a recent experience in our lab with molecular replacement (wt and
disordered point mutant; same space group and unit cell)
was solved with a combination of two methods.

1.  We made omit maps in the disordered region at several lower
resolutions.  The region became interpretable after suffering through
these maps, building residue by residue and refinement.
2.  Then we had the bright idea to make Fwt-Fmutant maps to confirm
our interpretation.  Happily this map did confirm the unexpected large
structural change caused by a point mutation.

----------
From: Eleanor Dodson
  Nothing profound to add to this interesting discussion, but I too would like to plug
FobsA - FobsB type maps - when A and B are similar but not quite the same.
It is prudent to omit the interesting parts of model A (or B) - whichever you use to calculate the PHIC and FOM -
but the peaks and pits often clear up ambiguity  brilliantly.

You need to CAD the two data sets together, and make sure both Fobs are on the same scale.
Eleanor
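
As a hedged sketch of the bookkeeping behind a FobsA - FobsB map (illustrative only; it does not reproduce CAD or any CCP4 program): match reflections by index, put B on the scale of A, and pair the difference with the phase and figure of merit from a model in which the interesting part has been omitted. The single least-squares scale factor used here is an assumption made to keep the example short; real scaling is normally resolution-dependent. All names and numbers are made up.

```python
import cmath
import math


def scale_factor(f_a, f_b):
    """Least-squares scale k minimising sum (Fa - k*Fb)^2 over common reflections."""
    common = f_a.keys() & f_b.keys()
    num = sum(f_a[hkl] * f_b[hkl] for hkl in common)
    den = sum(f_b[hkl] ** 2 for hkl in common)
    return num / den


def difference_coefficients(f_a, f_b, phases_deg, fom):
    """Return {hkl: complex coefficient} for an m*(FobsA - k*FobsB) map."""
    k = scale_factor(f_a, f_b)
    coeffs = {}
    for hkl in f_a.keys() & f_b.keys() & phases_deg.keys():
        amplitude = fom.get(hkl, 1.0) * (f_a[hkl] - k * f_b[hkl])
        coeffs[hkl] = amplitude * cmath.exp(1j * math.radians(phases_deg[hkl]))
    return coeffs


# Hypothetical amplitudes; reflections present in only one data set are skipped.
fobs_a = {(1, 0, 0): 10.0, (0, 1, 0): 20.0, (0, 0, 1): 5.0}
fobs_b = {(1, 0, 0): 10.0, (0, 1, 0): 10.0, (1, 1, 1): 7.0}
phic = {(1, 0, 0): 0.0, (0, 1, 0): 90.0}   # phases from the omit model
fom = {(1, 0, 0): 1.0, (0, 1, 0): 0.8}

k = scale_factor(fobs_a, fobs_b)
coeffs = difference_coefficients(fobs_a, fobs_b, phic, fom)
print(k, coeffs)
```

Positive and negative features in the resulting map correspond to the peaks and pits mentioned above: density present in A but not in B, and the reverse.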

