From: 商元 <shangyuan5000
Dear All,
I have a 3.2 A data set containing only 3000 reflections. From SAD phasing followed by iterative model building and density modification, I obtained a preliminary structure with poor geometry (~8/160 Ramachandran outliers in Coot). After Phenix MLHL refinement the geometry is still bad (10% Ramachandran outliers and 25% rotamer outliers), the B-factors all seem too high (all between 80 and 170, average ~120), and R/R-free are 0.328/0.326.
The poor geometry of my model and the unusual B-factors indicate there is still a lot of room for improvement. The question is: since I only have ~3000 reflections, roughly 1000 atoms in the sequence, and four parameters to refine per atom (x, y, z and B-factor, assuming occupancy is 1), how should I refine the model so as to avoid over-refinement? Should I trust the electron density map calculated from the refined MTZ, or should I adjust the local geometry with the Coot rotamer tools? And how do I arrive at reasonable B-factor values in refinement?
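For concreteness, that counting looks roughly like the sketch below (the per-atom restraint count in the second half is an assumed typical value, only for illustration, not something taken from my data):

n_reflections = 3000          # unique reflections to 3.2 A
n_atoms = 1000                # atoms in the model, roughly
params_per_atom = 4           # x, y, z, isotropic B (occupancy fixed at 1)

n_params = n_atoms * params_per_atom
print(f"parameters: {n_params}, reflections: {n_reflections}, "
      f"ratio: {n_reflections / n_params:.2f}")          # ~0.75

# Geometry restraints act as extra, soft observations.  Assuming, very
# roughly, ~5 bond/angle/plane/chirality restraints per atom:
restraints_per_atom = 5       # assumed typical value, for illustration only
n_obs = n_reflections + n_atoms * restraints_per_atom
print(f"effective observations with restraints: {n_obs}, "
      f"ratio: {n_obs / n_params:.2f}")                  # ~2.0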
Best Regards,
Yuan
----------
From: 商元 <shangyuan5000
Also, there is one more piece of information I forgot to mention: I also have NMR assignments (an HNCACB spectrum) for this protein. Is it possible to combine the NMR data into my refinement?
----------
From: Boaz Shaanan
Hi,
You are touching upon several issues here. The first question to ask is how good and complete your data to 3.2 A resolution are; this should be your first concern. Are they the best you can get at this stage? Second, you're absolutely correct that there is a lot more to do to improve your model. Although it sounds as if you're on the right track, having the Rwork/Rfree values so close sounds alarming. Also, with low-resolution data you can expect high B's, but of course you should try your best to fit residues/atoms to the electron density. Setting high weights on the geometry/stereochemistry restraints in whichever refinement program you're using can help here too, at least in the initial stages of refinement if not throughout. If you're using the automatic weights offered by the program, you may want to examine them carefully and see whether they can be modified to improve your convergence. So you have a lot to do before considering throwing in your NMR data.
As for the latter, I think there have been a few papers recently from David Baker's lab and Guy Montelione's lab showing how to use rudimentary NMR data (i.e. before converting them to NOEs) in refinement of crystal structures. Which brings up the next question: are you not going to calculate NOEs from your data? Are the NMR data that you have not sufficient to derive a good solution structure? NOE distance restraints on their own can be used to improve crystallographic structures (I can send you some old and recent references off list, if you're interested).
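In practice, examining and modifying the weights often amounts to a small scan: refine with a few candidate geometry/X-ray weights and keep the one that gives the lowest Rfree while the bond r.m.s.d. stays tight. A minimal sketch of that bookkeeping is below; refine_with_weight is only a stand-in for whatever your refinement program actually provides, and all the numbers are dummies:

# Sketch of a geometry/X-ray weight scan.  refine_with_weight is a
# placeholder: wrap your refinement program (phenix.refine, REFMAC,
# BUSTER, ...) and parse Rwork, Rfree and bond r.m.s.d. from its output.
def refine_with_weight(weight):
    # Dummy numbers so the bookkeeping below runs; replace with real calls.
    return 0.30, 0.33 + 0.01 * weight, 0.012 + 0.01 * weight

candidate_weights = [0.05, 0.1, 0.3, 0.5, 1.0]    # arbitrary example values
results = [(w,) + refine_with_weight(w) for w in candidate_weights]

# Keep runs with tight geometry (say, bonds below ~0.02 A at this
# resolution), then take the lowest Rfree among them.
acceptable = [r for r in results if r[3] < 0.02] or results
best = min(acceptable, key=lambda r: r[2])
print(f"chosen weight: {best[0]}, Rfree: {best[2]:.3f}, bonds: {best[3]:.3f}")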
Cheers,
Boaz
----------
From: Zhiyi Wei
Hi Yuan,
Bad geometry is a general issue for most low-resolution structure refinement; there are quite a lot of papers discussing it. I think you can try using a reference structure or tight restraints in refinement, which should be easy to set up in Phenix. How do you know the B-factors are too high? There is no standard for a single case: B-factors vary from crystal to crystal even when the crystals have the same molecular content. The Wilson B-factor may be a good indicator. I am a little bit worried about your R and R-free: they are too close (Rfree is even lower than R!). What is your space group? If you want to reduce the number of refined parameters and increase the data-to-parameter ratio, you can try group B-factor refinement in Phenix. To combine NMR data, you would need NOE assignments.
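For reference, the Wilson B is just the slope of the intensity falloff. The sketch below uses synthetic bins purely so it runs; in practice phenix.xtriage or CCP4 TRUNCATE reports the Wilson B directly.

import numpy as np

# Wilson plot: ln(<I>/Sigma f^2) = const - 2*B*(sin(theta)/lambda)^2, so a
# straight-line fit gives B = -slope/2.  These bins are synthetic; use your
# own binned averages from data processing.
s2_bins = np.linspace(0.005, 0.024, 10)    # (sin(theta)/lambda)^2, to ~3.2 A
pretend_B = 120.0                          # A^2, only to generate the data
mean_I_over_sumf2 = np.exp(5.0 - 2.0 * pretend_B * s2_bins)

slope, _ = np.polyfit(s2_bins, np.log(mean_I_over_sumf2), 1)
print(f"estimated Wilson B: {-slope / 2.0:.1f} A^2")     # recovers ~120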
Best,
Zhiyi
----------
From: Ed Pozharski
Notice that Rfree<R. This may be caused by twinning and/or NCS, as the
test set is not truly independent of the working set.
Normally, at 3.2A you would have strong geometry restraints, thus
reducing the effective number of degrees of freedom to perhaps as few as
two per residue.
Not sure what you mean by trusting the map versus fixing rotamers in Coot. Given the resolution and overall B
values, it is likely that you won't have strong enough electron density
to place many side chains. In which case you should omit the atoms
unsupported by electron density from your model. As a general rule, at
3.2A you should be able to trace the backbone and place some sidechains.
The B-factor values are what they are, you cannot "set" them to a
reasonable value of your choice. It is not entirely unusual to see
B~100 at 3.2A, so don't worry too much about that.
Cheers,
Ed.
--
"Hurry up before we all come back to our senses!"
Julian, King of Lemurs
----------
From: Pete Meyer
As others have mentioned, your geometry/x-ray weight may need to be
adjusted.
However, at 3.2 Angstroms I'd recommend against using atomic B-factors -
the "rule of thumb" for this is 2.8 Angstroms for atomic B-factors (or
at least it was back in the day). It might help to use an overall
B-factor combined with one (or a few) TLS groups.
Regarding how far to trust the density from the refined model - that's
what (composite) omit maps are for.
Good luck,
Pete
----------
From: Nat Echols
2012/1/6 Pete Meyer
This may be true for older software that restrains B-factors only between bonded atoms, but it is not the case in Phenix*, which takes into account all nearby atoms, not just bonded ones. The result is that
individual B-factor refinement is very stable at low resolution - we
don't know what the limit is, but it routinely works very well at 4A.
Of course the performance is still dependent on solvent content, NCS,
etc., but it is very rare that grouped B-factor refinement actually
works better.
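Conceptually it is a similarity term over all nearby pairs of atoms, something like the sketch below; this only illustrates the idea and is not the actual functional form or weighting used by phenix.refine.

import numpy as np

# Toy "local" B similarity restraint: every pair of atoms closer than a
# cutoff contributes (Bi - Bj)^2, down-weighted with distance.
def b_similarity_residual(xyz, b, cutoff=5.0):
    """xyz: (N, 3) coordinates in A; b: (N,) isotropic B-factors."""
    total = 0.0
    for i in range(len(b)):
        d = np.linalg.norm(xyz - xyz[i], axis=1)
        for j in np.where((d > 0.0) & (d < cutoff))[0]:
            if j > i:                              # count each pair once
                total += (b[i] - b[j]) ** 2 / d[j] # closer pairs weigh more
    return total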
-Nat
* I think Refmac may do something similar, but I haven't tried this
recently. I would be very surprised if it did not work well at 3.2A,
however.
----------
From: Yuri Pompeu
The first thing I would try is shooting more crystals; that's the easy way out. I once struggled with a 2.7 A data set for weeks only to find out I had a 1.5 A-diffracting crystal taking a bath in some storage buffer right next to my bench.
You mention that at this point you are looking at 25% rotamer outliers. I wonder whether at 3.2 A you really have density for these side chains; you may be trying to fit them into very weak, unreliable (i.e. noise) electron density. As Ed Pozharski suggested, omitting these may be the right thing to do.
How certain are you of your space group, and how did you generate your Rfree test set? You should have Rwork>Rfree after refining your model.
HTH
Yuri
----------
From: Yuri Pompeu
correction:
You should NOT have Rwork>Rfree
----------
From: Pete Meyer
B-factor refinement being stable is one thing; quieting my paranoia regarding over-fitting at low resolutions is another.
Thanks for pointing this out to me - I'll have to check out the details of how phenix handles it, and give it a try.
Pete
----------
From: Pavel Afonine
Details can be found here:
(page 61 and around)
and here:
(see article "On atomic displacement parameters...").
Pavel
----------
From: Ethan Merritt
Unfortunately, "stable" and "statistically correct" are two very different
criteria. It is quite possible to have a stable refinement that produces
nonsensical, or at least unjustifiable, B factors. Actually this caveat
applies to things other than B factors as well, but I'll stay on topic.
At last year's CCP4 Study Weekend I presented a statistical approach to
deciding what treatment of B could be justified at various resolutions.
"To B or not to B?" The presentations from that meeting should appear in a
special issue of Acta D soon.
Based on the set of representative cases I have examined, I am willing
to bet that with the limited obs/parameter ratio in the case at hand,
a model with individual Bs would turn out to be statistically unjustified
even if the refinement is "stable". A TLS model is more likely to be
appropriate.
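For a sense of scale, the competing parameterizations differ enormously in raw parameter count. The sketch below simply counts them for a model of the size discussed in this thread (one TLS group carries 20 refinable parameters; the residue count is taken from the Ramachandran statistics quoted at the top):

# Raw ADP parameter counts for alternative models of a ~1000-atom,
# ~160-residue structure (one TLS group = 20 refinable parameters).
n_atoms, n_residues = 1000, 160
models = {
    "individual isotropic B": n_atoms,
    "grouped B (one per residue)": n_residues,
    "1 TLS group + overall B": 20 + 1,
    "5 TLS groups + overall B": 5 * 20 + 1,
}
for name, n in models.items():
    print(f"{name:30s} {n:5d} parameters")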
cheers,
Ethan
--
Ethan A Merritt
----------
From: Ed Pozharski
A quick clarification request if I may:
We have all seen how well multi-group TLS models seem to match the B-factor variation along the chain. Is this, in your opinion, how such a model may really be effective: by incorporating most of the B-factor variation into ~100 TLS parameters?
And a question:
Given that the B-factors for side-chain atoms will generally be higher, do you know whether creating two separate sets of TLS parameters for the backbone and the side chains improves things?
Thanks,
Ed.
--
Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
Julian, King of Lemurs
----------
From: Ethan Merritt
I have run statistical analysis of alternative models on various
structures in the resolution range 2.8 - 4.0 A. For some of these,
I found that a full 1-Biso-per-atom model was indeed statistically
justified. For most, however, a TLS model was better. For some,
a hybrid Biso + TLS model was better than either alone. So this really
should be decided on a case by case basis rather than trying to come
up with a single "rule of thumb".
Now as to how many TLS groups a model should be partitioned into, that
varies all over the place and is clearly a consequence of the individual
lattice packing. For some structures with loose packing (as I interpret
the cause), a single-group TLS model with uniform constant per-atom B
is significantly better than a model with a separate B factor for each
atom but no TLS component. Adding additional TLS groups does not actually
help that much. To me this means that the largest factor contributing to
the ADPs is the overall displacement of the whole molecule within the
lattice, which is strongly anisotropic. The single-group TLS model
describes this anisotropy well enough, while any number of isotropic B
factors does not.
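As a small numerical aside, the relationship between an anisotropic U tensor and the single isotropic B that would replace it is simple to write down; the U tensor below is made up purely for illustration.

import numpy as np

# Equivalent isotropic B from an anisotropic U: B_eq = (8*pi^2/3) * trace(U).
# The eigenvalue spread of U is the anisotropy that an isotropic B discards.
U = np.array([[0.9, 0.1, 0.0],
              [0.1, 1.6, 0.2],
              [0.0, 0.2, 0.5]])               # A^2, made-up values

b_eq = 8 * np.pi ** 2 / 3 * np.trace(U)
evals = np.linalg.eigvalsh(U)
print(f"B_eq = {b_eq:.0f} A^2, anisotropy (min/max eigenvalue) = "
      f"{evals.min() / evals.max():.2f}")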
Those cases where the individual B factor option tests out best correspond,
as I interpret it, to relatively rigid lattice packing. In these crystals
the overall anisotropy is very low, so TLS models are not the right
formalism to use in describing the distribution of ADPs. Perhaps
normal-mode models would be better; it is hard to draw conclusions from
the very small number of normal-mode refinements reported to date.
That is a question that I am currently working on. I don't think that
two sets of TLS parameters will turn out to be a good way to handle it.
I am more attracted to the idea of applying a TLS description on top of
a fixed a priori model for B variation along the sidechain. This
approach is inspired by the per-amino acid targets for varying B along
the sidechain that were developed by Dale Tronrud for use in TNT.
cheers,
Ethan
----------
From: Eleanor Dodson
1) You say 3000 reflections vs. 4000 parameters, but you aren't counting the restraints as "observations" - refinement and rebuilding IS possible at 3.2 A.
2) The B factors are not unreasonable for the data. You may want to use fixed Bs with TLS, or restrain the Bs strictly.
3) How can your FreeR and R factor be so close - that MUST be wrong.
4) You can tighten the geometry with Coot or with stricter restraints. Maybe set the occupancies of the bad residues to 0.00 (a minimal sketch of doing this follows below) and see what they look like after refining the other, better-determined parts? They could then perhaps be rebuilt with more acceptable geometry.
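A minimal sketch of zeroing occupancies for a list of residues; the file name and residue numbers are made-up examples, and phenix.pdbtools or Coot can of course do the same thing:

# Zero the occupancies of selected residues in a PDB file.  Fixed columns:
# chain ID is column 22, residue number 23-26, occupancy 55-60.
bad_residues = {("A", 45), ("A", 46), ("A", 112)}   # (chain, resseq) examples

with open("model.pdb") as fin, open("model_occ0.pdb", "w") as fout:
    for line in fin:
        if line.startswith(("ATOM", "HETATM")):
            chain, resseq = line[21], int(line[22:26])
            if (chain, resseq) in bad_residues:
                line = line[:54] + "  0.00" + line[60:]
        fout.write(line)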
Eleanor
----------
From: Clemens Vonrhein
Yes - we did something a few years back for the structure of the human
voltage-dependent anion channel (slightly more reflections, but lower
resolution) using a combination of Se-MET phases (SHARP), NMR and
secondary-structure restraints in refinement (BUSTER). See
Monika Bayrhuber, Thomas Meins, Michael Habeck, Stefan Becker, Karin Giller, Saskia Villinger, Clemens Vonrhein, Christian Griesinger, Markus Zweckstetter and Kornelius Zeth (2008): Structure of the human voltage-dependent anion channel. Proc. Natl. Acad. Sci. USA 105: 15370-15375.
or
http://www.pnas.org/content/105/40/15370.full
It contains a fair amount of background information about the methods.
Cheers
Clemens
--
***************************************************************
* Clemens Vonrhein, Ph.D.