Wednesday 23 November 2011

atomic scattering factors in REFMAC

From: Ivan Shabalin
Date: 31 October 2011 15:57


Dear Refmac users,

I noticed that when I refine a structure containing SeMet, the Se atoms usually show large negative (red) peaks in the difference map and high B-factors. As I understand from diffraction theory and from some discussions on CCP4bb, this may happen because the atomic scattering factors in REFMAC are internally coded for copper radiation (CuKa).
I tried the keyword "anomalous wavelength 0.9683" and found that it changes the values of coefficient c for Se, Mn, and P, as shown in the REFMAC log file:

loop_
    _atom_type_symbol
    _atom_type_scat_Cromer_Mann_a1
    _atom_type_scat_Cromer_Mann_b1
    _atom_type_scat_Cromer_Mann_a2
    _atom_type_scat_Cromer_Mann_b2
    _atom_type_scat_Cromer_Mann_a3
    _atom_type_scat_Cromer_Mann_b3
    _atom_type_scat_Cromer_Mann_a4
    _atom_type_scat_Cromer_Mann_b4
    _atom_type_scat_Cromer_Mann_c

 N     12.2126   0.0057   3.1322   9.8933   2.0125  28.9975   1.1663   0.5826 -11.5290
 C      2.3100  20.8439   1.0200  10.2075   1.5886   0.5687   0.8650  51.6512   0.2156
 H      0.4930  10.5109   0.3229  26.1257   0.1402   3.1424   0.0408  57.7997   0.0030
 O      3.0485  13.2771   2.2868   5.7011   1.5463   0.3239   0.8670  32.9089   0.2508
 SE    17.0006   2.4098   5.8196   0.2726   3.9731  15.2372   4.3543  43.8163  -1.0329
 MN    11.2819   5.3409   7.3573   0.3432   3.0193  17.8674   2.2441  83.7543   1.3834
 P      6.4345   1.9067   4.1791  27.1570   1.7800   0.5260   1.4908  68.1645   1.2650
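
For readers who want to see what the changed constant does, here is a minimal sketch. It assumes the usual Cromer-Mann form f(s) = sum_i a_i*exp(-b_i*s^2) + c with s = sin(theta)/lambda, and it uses only rows printed in the log above. The names `CROMER_MANN`, `ATOMIC_NUMBER` and `form_factor` are made up for illustration and are not REFMAC identifiers. At s = 0 the value is simply sum(a_i) + c, so comparing it with the atomic number Z shows how far the constant term has been shifted.

```python
import math

# Rows copied from the REFMAC log above: (a1, b1, a2, b2, a3, b3, a4, b4, c)
CROMER_MANN = {
    "N":  (12.2126, 0.0057, 3.1322, 9.8933, 2.0125, 28.9975, 1.1663, 0.5826, -11.5290),
    "C":  (2.3100, 20.8439, 1.0200, 10.2075, 1.5886, 0.5687, 0.8650, 51.6512, 0.2156),
    "SE": (17.0006, 2.4098, 5.8196, 0.2726, 3.9731, 15.2372, 4.3543, 43.8163, -1.0329),
}
ATOMIC_NUMBER = {"N": 7, "C": 6, "SE": 34}


def form_factor(element, s):
    """f(s) in electrons, with s = sin(theta)/lambda in 1/Angstrom."""
    *ab, c = CROMER_MANN[element]
    gaussians = zip(ab[0::2], ab[1::2])  # (a_i, b_i) pairs
    return sum(a * math.exp(-b * s * s) for a, b in gaussians) + c


for element, z in ATOMIC_NUMBER.items():
    f0 = form_factor(element, 0.0)
    print(f"{element:2s}  f(0) = {f0:7.4f}   Z = {z:2d}   f(0) - Z = {f0 - z:+.4f}")
```

With these rows, N and C come out within about 0.01 electrons of Z, while SE comes out near 30.1, roughly 3.9 electrons below Z = 34. That would be consistent with a negative f' having been added to the constant term for the specified wavelength.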

As a result, the red peaks around Se are significantly lower, the Se B-factors are a bit smaller (like 25.6 and 23.1), and Rf is lowered by a bit more than 0.1% with the same input files.

That looks pretty good. Still, I want to ask your opinion on the following:

1) Is this the proper way to specify atomic scattering factors? I found this keyword under the REFMAC documentation topic "Simultaneous SAD experimental phasing and refinement", and I'm not sure whether specifying it changes anything else. I don't have separate F+, F- and the corresponding SIGF+, SIGF- in my mtz, so SAD experimental phasing should not be triggered.
2) Do you think it is safe to specify this keyword for every structure under refinement? Can it have any drawbacks (other than a wrong wavelength)?
As I understand it, the theoretical Cromer_Mann curve can differ from the experimental one, but it is still better than not changing the scattering factors at all.

Thank you very much!!

With best regards,
Ivan Shabalin, Ph.D.
Research Associate, University of Virginia
4-224 Jordan Hall, 1340 Jefferson Park Ave.
Charlottesville, VA 22908

----------
From: Ian Tickle

Hi, personally I didn't find that changing scattering factors for Se,
Br, I etc  made a big difference to the maps.  The more likely
explanation seems to be site occupancy disorder due to
radiation-induced breaking of covalent bonds, in which case you need
to refine the occupancy.  But maybe it's a bit of both.  I don't see
any harm in specifying the correct value of f' (as long as obviously
it is the correct value!).

Cheers

-- Ian

----------
From: Craig A. Bingman

That is correct.  We saw this in every selenomethionyl protein structure that was determined at CESG.  There are two reasons for the negative density defects at Se atoms.   As you note, the default scattering factors for Se are incorrect for these experiments, as f' is large in Se SAD experiments.  Additionally, the fractional incorporation of Se is often less than 100%.  For our autoinduction media, it was fairly consistently around 90 mole percent Se.  Your mileage will vary on the last effect.  In addition to these two effects, which are particular to SeMet around the Se K edge, both regular methionine and selenomethionine have somewhat disordered sidechain conformations more often than you might guess (without anomalous diffraction data poking you in the eye and showing definitively that the Se is in more than one place.)

----------
From: Ed Pozharski


Hope others will comment to clarify my confusion:

It seems that the potential effect of correcting the structure factor
data depends strongly on how close to the edge you are:

The reduction of the overall scattering factor has a steep wavelength
dependence.  For example, the Se atom has 34 electrons, so that should
be a rough estimate of its scattering factor in the absence of
absorption.  At the very edge, f'~-8 electrons, which seems equivalent
to the ~80% occupancy, or the difference density peak on par with that
of a water molecule.  I guess it's also true that close to the edge you
will have more damage, thus the negative density.

In other words, is it possible that Ian (and others) do not see a
significant effect from correcting the scattering factors because they
do not collect close to the edge, while Ivan might have done exactly
that?
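
The occupancy estimate above is plain arithmetic on the forward-scattering value, sketched here with the two numbers quoted in the message (34 electrons and f' of about -8). It only holds at sin(theta)/lambda = 0: f' is roughly constant with resolution while the normal scattering falls off, so the relative reduction grows at higher resolution. The variable names are illustrative.

```python
Z_SE = 34        # electrons in a neutral Se atom, as quoted above
F_PRIME = -8.0   # approximate f' at the Se K edge, as quoted above

effective_electrons = Z_SE + F_PRIME
equivalent_occupancy = effective_electrons / Z_SE

print(f"effective electrons at s = 0: {effective_electrons:.1f}")
print(f"equivalent occupancy:         {equivalent_occupancy:.3f}")
```

This gives 26 effective electrons, i.e. an equivalent occupancy of about 0.76, in line with the rough "~80%" figure.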

Cheers,

Ed.

--
After much deep and profound brain things inside my head,
I have decided to thank you for bringing peace to our home.
                                   Julian, King of Lemurs

----------
From: James Holton

Collecting "close to the edge" where the cross section of Se is higher does indeed increase the absorbed dose per scattered photon (Murray et al. JSR, 2005), but wavelength has absolutely no impact on the relative "rate" of Se-C bond breakage (Holton JSR 2007).  The number of Se-C bonds broken is proportional to the total amount of energy absorbed, not the atom type that initially absorbed it.  In fact, most of the Se-C bonds will break long before even a few percent of the Se atoms have been hit by a photon.

But yes, rad dam could be a reason for the negative peaks.  Formally, this is not a change in occupancy, since the Se atom does not actually leave the crystal or otherwise vanish from the universe, it just gets a very high B factor, and the centroid of its position probably moves away from the carbon atom.  It is hard to say.  This is why lowering the occupancy is as good a model as any.

Negative peaks can also come from scaling errors.  Remember, a Se atom with B=24 is 10 e-/A^3 tall, whereas a C with the same B factor is only 1.3 e/A^3 tall.  So, if you have a ~3% error in the scale factor, it will show up on the Se atoms first, and unless you have every single atom modeled, the scale factor of Fcalc will tend to be a bit high.
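
The peak heights can be checked with a short sketch. It assumes the 4-Gaussian-plus-constant coefficients from the REFMAC log earlier in the thread and the standard real-space form of a Gaussian form-factor term with an isotropic B factor, a*(4*pi/(b+B))**1.5 * exp(-4*pi**2*r**2/(b+B)), with the constant treated as a term with b = 0. The function name `density` is made up for illustration. Note that the Se row used here carries the wavelength-adjusted constant from that log, so its peak should come out somewhat below the ~10 e-/A^3 quoted above.

```python
import math

# (a1, b1, a2, b2, a3, b3, a4, b4, c), copied from the REFMAC log earlier in the thread
C_COEFFS = (2.3100, 20.8439, 1.0200, 10.2075, 1.5886, 0.5687, 0.8650, 51.6512, 0.2156)
SE_COEFFS_LOG = (17.0006, 2.4098, 5.8196, 0.2726, 3.9731, 15.2372, 4.3543, 43.8163, -1.0329)


def density(coeffs, r, B):
    """Electron density (e/A^3) at distance r (A) from an atom with B factor B > 0 (A^2)."""
    *ab, c = coeffs
    # Treat the constant c as one more Gaussian term with b = 0.
    terms = list(zip(ab[0::2], ab[1::2])) + [(c, 0.0)]
    return sum(
        a * (4 * math.pi / (b + B)) ** 1.5 * math.exp(-4 * math.pi ** 2 * r * r / (b + B))
        for a, b in terms
    )


c_peak = density(C_COEFFS, 0.0, 24.0)
se_peak = density(SE_COEFFS_LOG, 0.0, 24.0)
print(f"C  peak at B=24: {c_peak:.2f} e/A^3, 3% of that: {0.03 * c_peak:.3f}")
print(f"Se peak at B=24: {se_peak:.2f} e/A^3, 3% of that: {0.03 * se_peak:.3f}")
```

The carbon peak comes out near 1.3 e/A^3, so a 3% scale error is worth several times more density at an Se site than at a carbon site.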

Practically speaking, occupancy refinement is a perfectly good way to model all of the above phenomena.  Yes, changing the f' value is the "right" way to do it, but no doubt you've got other things going on as well, and the electron density cannot distinguish between them.  For example, if you compare the calculated electron density for an Se atom with B=24 and f' = -8 vs that of an Se with B=25.54 and occ=0.754, the curves are less than 0.1% different.  This is because when B>~10, all the details of the atomic form factor are blurred out by the much wider B-factor Gaussian.  It doesn't hurt to model the atomic form factors properly, but in almost all cases of MX, some other source of error is more important.
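
The comparison of the two Se models can be reproduced roughly as follows. This is a sketch under stated assumptions: the Gaussian terms are the Se row from the REFMAC log earlier in the thread; the default constant `SE_C_DEFAULT = 2.8409` is an assumption taken from the standard 4-Gaussian table and is not given anywhere in this thread; f' is modelled as a simple shift of that constant; and `se_density` is a made-up helper name.

```python
import math

# (a_i, b_i) pairs for Se, copied from the REFMAC log earlier in the thread
SE_AB = ((17.0006, 2.4098), (5.8196, 0.2726), (3.9731, 15.2372), (4.3543, 43.8163))
SE_C_DEFAULT = 2.8409  # ASSUMED default constant term; not stated in this thread


def se_density(r, B, occ=1.0, f_prime=0.0):
    """Se electron density (e/A^3) at distance r (A); f' is added to the constant term."""
    terms = SE_AB + ((SE_C_DEFAULT + f_prime, 0.0),)
    return occ * sum(
        a * (4 * math.pi / (b + B)) ** 1.5 * math.exp(-4 * math.pi ** 2 * r * r / (b + B))
        for a, b in terms
    )


radii = [i * 0.05 for i in range(61)]  # 0 to 3 A
model_fprime = [se_density(r, 24.0, f_prime=-8.0) for r in radii]
model_occ = [se_density(r, 25.54, occ=0.754) for r in radii]

peak = model_fprime[0]
max_rel_diff = max(abs(p - q) for p, q in zip(model_fprime, model_occ)) / peak
print(f"peak density: {peak:.2f} e/A^3")
print(f"largest difference, as a fraction of the peak: {max_rel_diff:.4f}")
```

Under these assumptions the two profiles agree to within a small fraction of a percent of the peak height, which is the practical point being made: a map cannot tell the two models apart.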

-James Holton
MAD Scientist

----------
From: Ed Pozharski

James,

this may be one of those physics-vs-math arguments again.  Surely the
occupancy can be used to account for everything, but it makes it a fudge
factor.  I'd say the right way is to use the adjusted scattering factors
first (after all, that is something we do know about the experiment),
and if there is still a residual density then one cannot distinguish
between radiation damage and partial Se-Met incorporation and has to
resort to occupancy, realizing that its meaning is somewhat unclear.

Cheers,

Ed.
--
Edwin Pozharski, PhD, Assistant Professor
University of Maryland, Baltimore
----------------------------------------------
When the Way is forgotten duty and justice appear;
Then knowledge and wisdom are born along with hypocrisy.
When harmonious relationships dissolve then respect and devotion arise;
When a nation falls to chaos then loyalty and patriotism are born.
------------------------------   / Lao Tse /

----------
From: Ivan Shabalin

Thanks everybody for  the profound answers!

As a summary, I can list the following reasons for the negative density defects at Se atoms:

1) the default scattering factors for Se are incorrect for wavelengths that are not close to CuKa, even though this may not be the major source of error. The effect also depends on the wavelength: the closer to the Se peak, the bigger the impact. This factor we can correct for sure.
2) the fractional incorporation of Se is often less than 100%. And it may be difficult to estimate.
3) radiation-induced breaking of covalent bonds, which is independent of the wavelength (!!!)
4) scaling errors. "So, if you have a ~3% error in the scale factor, it will show up on the Se atoms first, and unless you have every single atom modeled, the scale factor of Fcalc will tend to be a bit high" - James Holton
5) both regular methionine and selenomethionine have somewhat disordered sidechain conformations, and that can result in bigger Se (S) B-factors.


There is no harm in specifying the correct value of f' (I did it through the wavelength specification). It may help a little and is obviously correct.
Also, one can model partial occupancy for the Se atom, but one should understand that this can be a fudge factor. As a general conclusion, one can recommend decreasing the occupancy of Se if the red peak is very big.

I was intrigued by James Holton saying that "when B>~10, all the details of the atomic form factor are blurred out by the much wider B-factor Gaussian".
Does that mean that with Bf>10 we cannot distinguish Mg and water by their electron density peak profiles? Even if the oxygen in water has about twice the radius of Mg2+? I have one more question about modelling ions, and I'm going to ask it in a separate post.

Thanks a lot!!

With best regards,
Ivan Shabalin

----------
From: James Holton

Yup.  Pretty much.

An "Mg+2" with B=10 is almost exactly the same density profile as a
single point electron (atom type "Ano") with occ=9.72 and B=12.7.  You
can also fit "water" (an "O" with two "H" atoms on top of it) to Mg+2,
and get a pretty good fit with occ=1 and B=15 for the "water".  If you
want to play around with this, I have placed a gnuplot-ish version of
${CLIBD}/atomsf.lib at:

http://bl831.als.lbl.gov/~jamesh/pickup/all_atomff.gnuplot

in gnuplot you can type:
load 'all_atomff.gnuplot'
plot Mg_plus_2_ff(x,20), O_ff(x,15)+2*H_ff(x,15)

and stuff like that.

-James Holton
MAD Scientist

----------
From: Ivan Shabalin


Hi James!

Thank you very much for the gnuplot-ish version of ${CLIBD}/atomsf.lib!! It works very nicely and is very useful for education!

As I understand, the form factor is the Fourier transform of the electron charge density. It is plotted as f (electrons) vs sin(theta)/lambda and is approximated in REFMAC as a sum of Gaussians (Cromer and Mann; 4 Gaussians plus a constant in the log above). And you took the inverse Fourier transform of this approximation and plotted the electron density distribution in real space.

So, can I ask, what unit is x? Is it angstrom?
And what is Y? is it e/A3 (electron density)?

I found that at Bf=20 the density profiles look almost the same for ions and atoms (Mg2+ and Mg, Cl- and Cl). Does that mean there is no point in specifying the atomic charge in refmac refinement? It looks a bit strange, because the numbers of electrons are different. Or is the decrease in the number of electrons compensated by a significant decrease in atom size (which can have the same effect as lowering Bf)? With Bf=0 the difference between the curves is significant.

----------
From: Pavel Afonine

Ivan,

it might be helpful/instructive to have a look at this review on the subject you seem to be actively interested in:

Acta Cryst. (2004). A60, 19-32.

and some basic educational illustrations here:

http://phenix-online.org/newsletter/CCN_2011_01.pdf

(see "Electron density illustrations" article).

Pavel


----------
From: Pavel Afonine

Continuing on the subject, as far as I know there are at least three flavors of form-factors currently used in refinement programs:

"4 gaussian plus const":
International Tables for Crystallography (1992)

"5 gaussian plus const":
D. Waasmaier & A. Kirfel. Acta Cryst. (1995). A51, 416-431. "New analytical scattering-factor functions for free atoms and ions"

"n-gaussian" (n determined dynamically)
Grosse-Kunstleve RW, Sauter NK & Adams PD. Newsletter of the IUCr Commission on Crystallographic Computing 2004, 3:22-31. "cctbx news"

All three are available in PHENIX (the 3rd is used by default), and I presume the first one is used in CNS and Refmac, if I remember correctly (the authors of respective programs please correct me).

Pavel



----------
From: George M. Sheldrick

Just for the record, except for charge density studies most small molecule
structures are refined with neutral atom scattering factors even when ions such
as Cl- are present. For example SHELX uses "4 Gaussian plus const: International
Tables for Crystallography (1992)". Users rarely input ionic scattering factors
rather than using the default neutral atom scattering factors, possibly they are
afraid of a Category A Alert from CheckCIF if they have a charged crystal. In
practice, the difference between ionic and neutral atom scattering factors is
mostly absorbed by the displacement parameters (B-values) and the R-factors are
very similar. The main reason why routine small molecule R-values are rarely much
less than 2% is the assumption that the scattering factors are spherically
symmetrical, i.e. bonding and lone-pair electrons are ignored. This could be
addressed by the inclusion of invarioms (precalculated aspherical scattering
factors that depend on the local chemical environment of an atom) in Refmac and
Phenix Refine.

George
--
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany


----------
From: Alexandre OURJOUMTSEV


Dear all,

 

Just as a historical reminder, I feel it necessary to mention a key article on refinement, written by R. Agarwal (1978) in Acta Cryst A, where he used 1- and 2-Gaussian atomic factors (obviously less precise than 4 Gaussians + constant, but this allowed him at that time to accelerate the calculations as much as possible).

 

With best regards,

 

Sacha Urzhumtsev

 


----------
From: Eleanor Dodson


James - you are fantastic!
This is so educational..

Eleanor

----------
From: Ian Tickle


James, this doesn't take the effect of resolution cut-offs into
account, right?  You appear to be assuming that you have data to
atomic resolution (~ 1 A) or better.  The integral of the scattering
factor should be confined to the experimental resolution range,
otherwise it's not going to be very realistic, in fact the calculated
profile is not going to show the observed resolution dependence (and
where the density can go negative).  See slides 13 & 14 here
http://www.cse.scitech.ac.uk/events/CCP4_2011/talks/tickle.pdf for the
resolution-dependent version.
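
A rough sketch of the resolution-limited profile described above, for a single carbon atom. It assumes the carbon coefficients from the REFMAC log earlier in the thread, an isotropic B of 20, a sharp cut-off at d_min = 3.0 A, and a spherically averaged Fourier integral evaluated with a simple midpoint rule. It is not taken from the linked slides, and `scattering` and `truncated_density` are made-up names.

```python
import math

# Carbon row from the REFMAC log earlier in the thread: (a1, b1, ..., a4, b4, c)
C_COEFFS = (2.3100, 20.8439, 1.0200, 10.2075, 1.5886, 0.5687, 0.8650, 51.6512, 0.2156)


def scattering(h, B):
    """Carbon form factor times the B-factor term; h = 1/d = 2*sin(theta)/lambda."""
    s2 = (h / 2.0) ** 2
    *ab, c = C_COEFFS
    f = sum(a * math.exp(-b * s2) for a, b in zip(ab[0::2], ab[1::2])) + c
    return f * math.exp(-B * s2)


def truncated_density(r, B, d_min, steps=2000):
    """Density (e/A^3) at distance r (A) using only data with d >= d_min."""
    h_max = 1.0 / d_min
    dh = h_max / steps
    total = 0.0
    for i in range(steps):
        h = (i + 0.5) * dh                      # midpoint of the integration step
        x = 2.0 * math.pi * h * r
        sinc = math.sin(x) / x if x > 1e-9 else 1.0
        total += 4.0 * math.pi * h * h * scattering(h, B) * sinc * dh
    return total


radii = [0.1 * i for i in range(51)]            # 0 to 5 A
profile = [truncated_density(r, 20.0, 3.0) for r in radii]
r_min = radii[profile.index(min(profile))]
print(f"peak at r=0: {profile[0]:.3f} e/A^3")
print(f"most negative value: {min(profile):.3f} e/A^3 at r = {r_min:.1f} A")
```

With the cut-off applied, the peak is lower than the untruncated value of about 1.3 e/A^3 for the same B, and the profile dips below zero a few Angstroms from the atom centre, which is the ripple behaviour referred to above.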

Cheers

-- Ian

----------
From: James Holton


Yes, in my gnuplot "form factor" functions, "x" is the real-space distance from the center of the atom in Angstrom and the "return value" is electron density in electrons/A^3.

I did not realize the gnuplot file would be so interesting!  If anyone wants the reciprocal-space version (which is simpler), it is here:
http://bl831.als.lbl.gov/~jamesh/pickup/all_atomsf.gnuplot

Where the "s" value is sin(theta)/lambda and the return value is simply "electrons".  This is because the structure factor is defined as the ratio of the scattering from the atom (or any other object) to the scattering from a single point electron at the origin (Debye & Scherrer, Phys Z. 1918; Hartree, Phil. Mag. 1925).


The incorporation of a B factor is formally a convolution in real space (blurring function), which translates into a simple multiplication in reciprocal space.  The funky (4*pi/B)**1.5 factors in the real-space functions arise because the total number of electrons must not change when you apply a B factor.  This is why the peak height goes down with increasing B, and you also rapidly lose any "atomic radius" information as the width of the B-factor Gaussian becomes large when compared to the width of the relevant Ee_ff(x,0) function.
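
The electron-conservation argument can be checked numerically for a single Gaussian term. This is a sketch that assumes the real-space form a*(4*pi/(b+B))**1.5 * exp(-4*pi**2*r**2/(b+B)) and uses the first carbon term (a = 2.31, b = 20.8439) from the REFMAC log earlier in the thread. The helper names are made up and are not the functions in the gnuplot file.

```python
import math


def gaussian_density(r, a, b, B):
    """Real-space density (e/A^3) of one form-factor Gaussian blurred by B factor B."""
    width = b + B
    return a * (4 * math.pi / width) ** 1.5 * math.exp(-4 * math.pi ** 2 * r * r / width)


def total_electrons(a, b, B, r_max=30.0, steps=30000):
    """Integrate the density over all space in spherical shells (midpoint rule)."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += 4 * math.pi * r * r * gaussian_density(r, a, b, B) * dr
    return total


A1, B1 = 2.3100, 20.8439  # first carbon Gaussian from the log
results = []
for B in (0.0, 20.0, 80.0):
    peak = gaussian_density(0.0, A1, B1, B)
    electrons = total_electrons(A1, B1, B)
    results.append((B, peak, electrons))
    print(f"B = {B:5.1f}   peak = {peak:.4f} e/A^3   integral = {electrons:.4f} e")
```

The integral stays at a = 2.31 electrons for every B, while the peak height drops as B increases, which is exactly what the (4*pi/B)**1.5 normalisation is for.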


Ian has also pointed out that none of this considers the mechanics of how you actually calculate maps, where things like "series termination error" come into play.  But perhaps that is a topic for a new thread?

-James Holton
MAD Scientist

----------
From: Ivan Shabalin

I'm very grateful to the community for supporting me with such interesting information!

Now my understanding of the subject is much better!

