From: Phil Evans
Date: 12 October 2011 14:55
I've been struggling a bit to understand the definition of B-factors, particularly anisotropic Bs, and I think I've finally more-or-less got my head around the various definitions of B, U, beta etc, but one thing puzzles me.
It seems to me that the natural measure of length in reciprocal space is d* = 1/d = 2 sin theta/lambda
but the "conventional" term for B-factor in the structure factor expression is exp(-B s^2) where s = sin theta/lambda = d*/2 ie exp(-B (d*/2)^2)
Why not exp (-B' d*^2) which would seem more sensible? (B' = B/4) Why the factor of 4?
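A quick numerical check of the equivalence asked about above, as a minimal sketch. The function names and the sample values of B and d are illustrative assumptions and do not come from any program mentioned in this thread.

```python
import math

def debye_waller_s(b, d):
    """Attenuation exp(-B s^2), with s = sin(theta)/lambda = 1/(2d)."""
    s = 1.0 / (2.0 * d)
    return math.exp(-b * s * s)

def debye_waller_dstar(b_prime, d):
    """Attenuation exp(-B' d*^2), with d* = 1/d."""
    d_star = 1.0 / d
    return math.exp(-b_prime * d_star * d_star)

# Illustrative values only: B in A^2, d in A.
b, d = 20.0, 2.0
# Both conventions give the same attenuation when B' = B/4.
print(debye_waller_s(b, d), debye_waller_dstar(b / 4.0, d))
```

The two numbers agree for any B and d, so the choice between s and d* changes only the bookkeeping, not the physics.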
Or should we just get used to U instead?
My guess is that it is a historical accident (or relic), ie that is the definition because that's the way it is
Does anyone understand where this comes from?
Phil
----------
From: Eleanor Dodson
Not sure if this is helpful Phil, but SCALEIT output includes various definitions taken from the Willis and Pryor book.
But then there is the problem of converting the amplitude B factors to real space..
I attach my anisotropy notes..
It doesn't address the question of sensible conventions!!
E
----------
From: Ian Tickle
Hi Phil
My understanding is that when the B factor was devised it was believed
that it wouldn't represent any physical reality and was initially at
least widely regarded as a "garbage dump" for errors. So it made no
difference whether or not it was related to the natural length in
reciprocal space, it was just a number, a "fudge factor" used to fit
the data. Bs^2 is simplest to calculate from theta (which can be
measured directly from the film or diffractometer setting), lambda
(which is fixed of course) and B - particularly if you don't have a
computer! Also a significant point may be that the scattering factors
were tabulated as a function of s=sin(theta)/lambda (but you could
equally well ask why 2sin(theta)/lambda wasn't used there). So it's
more convenient to have B multiplying s^2 since you can simply add B
to the constant part of the Gaussian scattering factor function. Of
course they could have absorbed the extra factor of 2 into lambda
(i.e. use lambda/2 instead of lambda) but maybe no-one thought of
that!
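The convenience described above can be sketched as follows, assuming the usual sum-of-Gaussians form f(s) = sum_i a_i*exp(-b_i*s^2) + c. The coefficients below are made-up numbers for illustration, NOT real tabulated scattering factors, and the function names are hypothetical.

```python
import math

# Hypothetical coefficients for illustration only (not real tabulated values):
# f(s) = sum_i a_i * exp(-b_i * s^2) + c, with s = sin(theta)/lambda
A_COEF = [2.0, 1.5, 1.0, 0.5]
B_COEF = [20.0, 10.0, 1.0, 50.0]
C_CONST = 0.2

def f_static(s):
    """Scattering factor of the atom at rest."""
    gaussians = sum(a * math.exp(-b * s * s) for a, b in zip(A_COEF, B_COEF))
    return gaussians + C_CONST

def f_thermal_product(s, b_iso):
    """Scattering factor multiplied explicitly by exp(-B s^2)."""
    return f_static(s) * math.exp(-b_iso * s * s)

def f_thermal_shifted(s, b_iso):
    """Same quantity, obtained by adding B to every Gaussian exponent.

    The constant c is treated as a Gaussian term with exponent 0.
    """
    terms = zip(A_COEF + [C_CONST], B_COEF + [0.0])
    return sum(a * math.exp(-(b + b_iso) * s * s) for a, b in terms)

print(f_thermal_product(0.25, 20.0), f_thermal_shifted(0.25, 20.0))
```

Because both the tabulation and the B term use s^2, the temperature factor is absorbed by a simple addition, with no factor of 4 to carry around.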
U, the mean square displacement, is the quantity which is directly
related to the physics so if it's realism you're after, use U, not B
(or beta).
Cheers
-- Ian
----------
From: James Holton
The weird scale factor on what we now call "B" comes from Debye's derivation of it (Debye 1914). This derivation is given much more succinctly in Bragg's textbook (since edited by R. W. James (1962) page 22, which I attach). The "kappa" used therein is the magnitude of the incident radiation's wave vector (2*pi/lambda), so there is one factor of 2*pi. The scattering vector "S" is the difference between the unit vectors of the incident and scattered beams, so it is 2*sin(theta) long: another factor of two. Then you can see in equation 1.27 that the second term in the expansion is divided by 2 and squared. The mathematical "reason" for this is similar to the rules for taking a derivative, and equation 1.27 is really just a Taylor expansion in the exponent. You can also think of it as Debye assuming that the atoms move like simple harmonic oscillators.
So, the sum total of all the weird scale factors is to multiply by 2 to convert from sin(theta)/lambda to d*, multiply by 2*pi because the wavevector is in radians, then square it and divide by two because "B" is the second term in a Taylor series. This gives a final scale factor of (2*2*pi)^2/2, or:
B = 8*pi^2*<u_x^2>
where "u" is the atomic displacement vector and u_x is the component of that vector normal to the Bragg plane. Remarkably, movement in the other two directions doesn't change the spot intensity.
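The factor of 8*pi^2 can be checked numerically. This is a minimal sketch that assumes a Gaussian (harmonic) distribution for u_x and averages the phase factor exp(2*pi*i*d* * u_x) by Monte Carlo; the function names, sample size and seed are illustrative choices.

```python
import math
import random

def simulated_attenuation(sigma, d, n=200_000, seed=0):
    """Monte Carlo average of the phase factor for Gaussian u_x.

    sigma is the r.m.s. displacement along the normal to the Bragg plane,
    so <u_x^2> = sigma^2. Only the cosine (real) part is averaged; the
    sine part averages to zero by symmetry.
    """
    rng = random.Random(seed)
    d_star = 1.0 / d
    total = 0.0
    for _ in range(n):
        u_x = rng.gauss(0.0, sigma)
        total += math.cos(2.0 * math.pi * d_star * u_x)
    return total / n

def predicted_attenuation(sigma, d):
    """exp(-B s^2) with B = 8*pi^2*<u_x^2> and s = 1/(2d)."""
    b = 8.0 * math.pi ** 2 * sigma ** 2
    s = 1.0 / (2.0 * d)
    return math.exp(-b * s * s)

# Illustrative values: 0.5 A r.m.s. displacement at 2 A resolution.
print(simulated_attenuation(0.5, 2.0), predicted_attenuation(0.5, 2.0))
```

The two values should agree to within the sampling noise, which is a few parts in a thousand for this sample size.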
You may also note that R.W. James (1962) does not explicitly use "B", but rather M = B*(sin(theta)/lambda)^2. The quantity exp(-2*M) is what is traditionally known as the "Debye factor". The "2" in exp(-2*M) puts it on an intensity scale, just like the polarization factor, Lorentz factor and absorption factor. The splitting of "M" into "B" and "(sin(theta)/lambda)^2" was probably done in Waller's thesis (which I don't have), but I think it does make sense to use the letter "B" if you decide to simplify R. W. James's equation 1.27 as:
exp(-A*s-B*s^2-C*s^3 ... ).
abbreviating "sin(theta)/lambda" as "s".
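To make the amplitude versus intensity distinction concrete, here is a small sketch using M = B*(sin(theta)/lambda)^2 as defined above. The function names and the example values of B and d are assumptions for illustration.

```python
import math

def debye_m(b, d):
    """M = B * (sin(theta)/lambda)^2, with sin(theta)/lambda = 1/(2d)."""
    s = 1.0 / (2.0 * d)
    return b * s * s

def amplitude_factor(b, d):
    """Attenuation of the structure factor amplitude: exp(-M)."""
    return math.exp(-debye_m(b, d))

def intensity_factor(b, d):
    """Attenuation of the spot intensity: exp(-2M)."""
    return math.exp(-2.0 * debye_m(b, d))

# Illustrative values: B = 20 A^2 at d = 2 A gives M = 1.25.
b, d = 20.0, 2.0
print(debye_m(b, d), amplitude_factor(b, d), intensity_factor(b, d))
```

The intensity factor is simply the square of the amplitude factor, which is where the "2" in exp(-2*M) comes from.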
As for the question of WHY Debye did this in terms of "s" and not "d", I think it could simply be because he had not yet heard of "Bragg's Law", which was published only a few months earlier (Bragg & Bragg Proc. R. Soc. Lon. 1913). Papers didn't spread as quickly then as they do now. Why was the "sin(theta)/lambda formalism" kept? I think because Bragg defined "d" as the spacing between two planes of a crystal lattice, and so using "d" in the "general scattering equation" is arguably inappropriate. Since the B factor connects disordered crystals with diffuse scattering and completely amorphous substances, it probably actually is a good idea to keep it in terms of "r dot S", where "r" is the position vector and "S" is the difference between the directions of the incident and scattered beams.
The fundamental problem with getting rid of all the "scale factors" is that reciprocal space is not actually an inverse-distance space, it is an angle space. That's why all those pesky factors of 2*pi keep popping up everywhere and mixing with factors of two. For example, there are people who follow the gospel of "q", but I've never really understood why.
-James Holton
MAD Scientist
----------
From: Pavel Afonine
This may answer some of your questions or at least give pointers:
Grosse-Kunstleve RW, Adams PD:
On the handling of atomic anisotropic displacement parameters.
Journal of Applied Crystallography 2002, 35, 477-480.
Pavel
----------
From: Phil Evans
Indeed that paper does lay out clearly the various definitions, thank you, but I note that you do explicitly discourage use of B (= 8 pi^2 U), and don't explain why the factor is 8 rather than 2 (ie why it multiplies (d*/2)^2 rather than d*^2). I think James Holton's reminder that the definition dates from 1914 answers my question.
So why do we store B in the PDB files rather than U? :-)
Phil
----------
From: James Holton
I think the PDB decided to store "B" instead of "U" because unless the
B factor was > 80, there would always be a leading "0." in that
column, and that would just be a pitiful waste of two bytes. At the
time the PDB was created, I understand bytes cost about $100 each!
(But that could be a slight exaggeration)
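The threshold quoted above follows from U = B/(8*pi^2). A minimal sketch of the conversion, with hypothetical helper names:

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2  # roughly 78.96

def b_to_u(b):
    """Convert an isotropic B factor (A^2) to mean square displacement U (A^2)."""
    return b / EIGHT_PI_SQ

def u_to_b(u):
    """Convert mean square displacement U (A^2) to an isotropic B factor (A^2)."""
    return u * EIGHT_PI_SQ

# U only reaches 1.0 A^2 once B reaches 8*pi^2, i.e. a little under 80.
print(EIGHT_PI_SQ, b_to_u(20.0), b_to_u(80.0))
```

For typical B values the corresponding U is well below 1, hence the leading "0." in the column.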
-James Holton
MAD Scientist
----------
From: Frances C. Bernstein
At this point I usually chime in with an explanation of why
the Protein Data Bank made some choice or other in the early
days but on the matter of U vs. B I have no information to
contribute.
I can point out that at that time characters were stored in
display code on a CDC 6600 and display code used 6 bits so
'bytes' at that time were less obese. 6 bits per character
explains, of course, why lower case characters were not
routinely used.
Frances
=====================================================
**** Bernstein + Sons
* * Information Systems Consultants
**** 5 Brewster Lane, Bellport, NY 11713-2803
=====================================================
----------
From: Ian Tickle
Yet Uaniso's are multiplied by 10000 and stored as integers with no problem!
-- Ian