Wednesday, 9 May 2012

How to reduce R factor

From: Dipankar Manna
Date: 14 March 2012 09:26


Dear Crystallographers,

 

Can anybody guide me on how to reduce the R-factor? That is, which basic parameters should I look at to reduce the R-factor during refinement? I am new to refinement. After running Molrep the R-factor is around 53% (100% identity), after rigid body refinement it is around 49%, and after restrained refinement it is around 47%. The highest resolution is 2.5 Å.

 

Regards

 

Dipankar



----------
From: Tim Gruene

Dear Dipankar,

if you refine your model straight after molecular replacement you risk
further strengthening model bias, which could wipe out features in your
data that would otherwise help you improve your model.

Look at the model and the map after rigid body refinement with the model
building program of your choice and improve the model as much as you can
before you run any further refinement. If you do not see any features
in the map deviating from the model, chances are high that your MR
solution is incorrect.

Best wishes,
Tim


----------
From: Harry Powell


Hi Dipankar

If you've been reading the ccp4bb for more than a couple of weeks, you should have realised that reducing your R-factor is *not* the goal of refinement - having a low R-factor is one of the consequences of having built your model well and of having performed a good refinement. Don't try to reduce all the thousands of observations, days (weeks, months, years...) of work and thousands of pounds/dollars/euros/rupees to a single number.

If you really don't know what you should be doing, and this is your first time, you should do the following, all of which will give you much more useful information than you can possibly get from ccp4bb.

(1) Find a copy of David Blow's book "Outline of Crystallography for Biologists" and read it, especially chapter 12 (Structural Refinement). This will take no more than a couple of days if you are reasonably happy with what you are doing.

(2) Find a copy of Bernhard Rupp's book "Biomolecular Crystallography" and read the chapter on "Model building and Refinement" (also, coincidentally, chapter 12). Keep the book next to you while you are learning protein crystallography. 

(3) Actually, this should be the *first* thing you should do. Talk to experienced crystallographers in your lab. If they are any good at all, they will explain to you what you should be doing and why.

(4) Go on a course - Aurigene should find it well worth the investment in paying for their employees to attend one of the various intensive protein crystallography courses that take place around the world. At these courses, you get the chance to meet and discuss issues with global leaders in the field - and learn a huge amount.

(5) (Worst option) Read past posts on this on the ccp4bb - they should only make you realise that you should have done (1) to (4) above anyway.

HTH, 

Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre, Hills Road, Cambridge, CB2 0QH




----------
From: <Herman.Schreuder


Dear Dipankar,
 
Molrep R-factors of around 50% with a model of 100% identity mean that something went wrong and you did not find the solution. To find the problem, I would proceed as follows:
1) Check the processing of the data and the space group: are the processing statistics OK? Did you let the processing software find the space group, or did you specify it? The true space group may be different from what you think. You may also process in P1 and let Pointless figure out the space group.
2) Check that you used the correct search model. It may be trivial, but if you mixed up PDB files, you will never find a solution.
3) Run Molrep or Phaser with the option to test all possible space groups for your crystal system. During processing it is not always possible to reliably distinguish between e.g. P212121, P21212, P2221 etc. The only way to find out is to systematically try all possibilities. All molecular replacement programs have an option for this.
4) It may also be that you searched for too many or too few molecules. Do separate searches for 1 up to as many molecules as fit in the asymmetric unit. It is not common, but crystals exist with only 30% solvent or as much as 70% solvent.
5) Finally, try to find an experienced crystallographer to help you. Again, your problem is not with the refinement, but with the molecular replacement.
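
As a rough illustration of point 3: in a primitive orthorhombic cell each of the three axes can be either a pure twofold (2) or a screw axis (21), which gives eight settings to try. The sketch below only enumerates the setting names. It is not how Molrep or Phaser implement their space-group search, and the function name is made up for this example.

```python
from itertools import product

def orthorhombic_p_candidates():
    """Return the eight primitive orthorhombic settings,
    with each axis being either a twofold (2) or a screw axis (21)."""
    return ["P " + " ".join(axes) for axes in product(("2", "21"), repeat=3)]

candidates = orthorhombic_p_candidates()
for name in candidates:
    # Number of screw axes in this setting (0 to 3).
    n_screw = name.split()[1:].count("21")
    print(f"{name:12s} screw axes: {n_screw}")
```

Settings with the same number of screw axes (for example P 2 2 21, P 2 21 2 and P 21 2 2) belong to the same space-group type with the axes permuted, which is why the systematic absences alone may not tell you which axis carries the screw.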
 
Good luck!
Herman




----------
From: <Herman.Schreuder


Dear Dipankar,
It just occurred to me that your high R-factors may also be due to a large conformational change of your protein. In that case you have to split your search model into separate PDB files for the separate domains and run Molrep with these separate domains.
 
Best,
Herman




----------
From: Ed Pozharski


On Wed, 2012-03-14 at 09:26 +0000, Dipankar Manna wrote:
> After running molrep R-factor is around 53% (100% identity), after
> rigid body refinement its showing around 49% and after restrained
> refinement its showing around 47%.

Sounds like you didn't get a solution.  With 100% identity MR in most
cases works like a charm, so there must be something wrong with

1) Data processing - wrong spacegroup?  Try processing your data in P1.
If MR works after that, start working up to the higher symmetry.  If you
need specific advice from the bb, provide details on unit cell
parameters, space group, R-merge, chi-square etc.  Best of all, post
your log files.

2) Model - without further information, it's impossible to say what the
problem is.  Describe to the bb your protein - molecular weight, how
many domains, etc.  Sequence identity is not the key, it's the rmsd
between your model and your structure.  There are examples in the
literature when 100% identical model does not work even if broken into
domains, although it's very likely that your problems lie elsewhere.

3) Molecular replacement - sometimes the right model is rejected because
you get some conformational changes and therefore clashes.  R~53% after
MR usually means that you did not find a solution.  Stick in a
completely wrong model of the same size to get an idea of what to expect
when MR fails.

4) Refinement - least likely at this point, but check for twinning.
Most of all, see if the electron density makes sense - a good test is to
remove part of the model and see if it shows up in the difference map.
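
For readers new to the number under discussion: the conventional R-factor is R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|). For a completely random acentric structure the classic expectation is an R of about 0.59, which is why values above 50% after MR look like no solution at all. A minimal sketch in plain Python follows; the amplitudes are invented, and the single scale factor k = sum(Fobs)/sum(Fcalc) is a simplifying assumption (refinement programs use more elaborate scaling and bulk-solvent corrections).

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor with one overall linear scale factor.

    f_obs and f_calc are equal-length lists of structure factor amplitudes.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    scale = sum(f_obs) / sum(f_calc)  # assumed simple overall scale k
    mismatch = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return mismatch / sum(f_obs)

# Invented amplitudes, for illustration only.
f_obs = [100.0, 50.0, 25.0, 80.0]
f_good = [98.0, 52.0, 24.0, 81.0]   # model that roughly matches the data
f_bad = [30.0, 90.0, 70.0, 20.0]    # model unrelated to the data

print(f"good model: R = {r_factor(f_obs, f_good):.3f}")
print(f"bad model:  R = {r_factor(f_obs, f_bad):.3f}")
```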

Good luck,

Ed.

--
Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
                                               Julian, King of Lemurs

----------
From: Roger Rowlett


Dipankar,

An MR R-factor of 53% is close to what you get with a random, incorrect solution. Even for challenging MR cases, your MR R-factor should normally be under 50% before rigid-body refinement of the MR solution. As others have mentioned, you should not proceed directly to refinement unless you know your MR solution is sensible and you have fixed the obvious problems; otherwise you may lock in some model bias. There are a few sanity checks you should perform before proceeding:
  1. Inspect the model in Coot or Pymol (or whatever), turn on symmetry molecules, and inspect molecule packing in the lattice. If you don't get nicely packed molecules with reasonable intermolecular contacts (no major clashes or interpenetrating molecules, no "lonely" molecules) and obvious solvent channels, the space group is likely wrong. Run Phaser with the option to look at all alternative space groups.
  2. Run a cell content analysis in Phaser. (You should do this first.) This feature uses the Matthews probability calculator to estimate the number of search models in the asymmetric unit. If you have too many/too few models in the ASU, you won't get a good solution. Inspecting packing of the lattice may alert you to having too many/too few protein chains in the ASU.
  3. Inspect your electron density maps. If it is difficult to trace the main chain or see clear side chain density, it is not likely you have a solution. However, some incorrect solutions can sometimes give quasi-sensible-looking density. If your solution is decent, you should be able to see non-protein features in the difference maps, e.g. metal ions should stand out in metalloenzyme structures.
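
The cell content analysis in point 2 rests on the Matthews coefficient, Vm = V / (n_symops * n_mol * MW), with the solvent fraction commonly approximated as 1 - 1.23/Vm. A rough sketch of the arithmetic is given here. The cell dimensions, molecular weight and number of symmetry operators are invented for illustration, and the resolution-dependent probabilities that Phaser reports are not reproduced.

```python
import math

def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic Angstrom from cell lengths (A) and angles (deg)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

def matthews(volume, n_symops, n_mol, mw_da):
    """Return (Vm in A^3/Da, approximate solvent fraction)."""
    vm = volume / (n_symops * n_mol * mw_da)
    solvent = 1.0 - 1.23 / vm
    return vm, solvent

# Hypothetical crystal: 60 x 70 x 80 A orthorhombic cell,
# 4 symmetry operators (e.g. P212121), 30 kDa protein.
volume = cell_volume(60.0, 70.0, 80.0)
for n_mol in range(1, 4):
    vm, solvent = matthews(volume, 4, n_mol, 30000.0)
    note = "" if solvent > 0 else "  (does not fit)"
    print(f"{n_mol} per ASU: Vm = {vm:.2f} A^3/Da, solvent = {solvent:.0%}{note}")
```

A negative solvent fraction simply means that many copies cannot fit in the asymmetric unit, so those searches can be skipped.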

It is possible that your search model contains features (N- and C-terminal secondary structures or loops) that are disordered in the crystal. Including these in the search model can cause problems with clashes and poor phasing. Again, inspecting the electron density and/or clashes in the MR solution may alert you to this issue. Modifying your search model appropriately may help. Or not.

If you have reason to believe your search model is a good one, Phaser and Open-EPMR have never failed me, even with search models with just under 30% identity or high copy numbers per ASU.

Cheers,

_______________________________________
Roger S. Rowlett
