Wednesday 7 December 2011

Combining big MTZs

From: Jacob Keller
Date: 18 November 2011 15:36


Dear Crystallographers,

I am getting an error when I try to merge two mtz's from mosflm, one
with 180 and one with 360 frames, each from different but similar
crystals--see below. I can't imagine this really exceeds the max
number of records, so what am I doing wrong? Additionally but related,
what is the optimal procedure in CCP4 for combining data from two
similar crystals?

JPK



#CCP4I VERSION CCP4Interface 2.1.0
#CCP4I SCRIPT LOG sortmtz
#CCP4I DATE 18 Nov 2011  09:26:28
#CCP4I USER 'UNKNOWN'
#CCP4I PROJECT NatVSulP
#CCP4I JOB_ID 26
#CCP4I SCRATCH C:/Ccp4Temp
#CCP4I HOSTNAME chloe
#CCP4I PID 5396

<html> <!-- CCP4 HTML LOGFILE -->
<hr>

<a name="smsortmtz"><h2>SORTMTZ</h2></a>
<pre>

 ###############################################################
 ###############################################################
 ###############################################################
 ### CCP4 6.2: SORTMTZ                  version 6.2 : 06/09/05##
 ###############################################################
 User: Jacob  Run date: 18/11/2011 Run time: 09:26:28


 Please reference: Collaborative Computational Project, Number 4. 1994.
 "The CCP4 Suite: Programs for Protein Crystallography". Acta Cryst.
D50, 760-763.
 as well as any specific reference in the program write-up.

</pre>

<a name="tocSORTMTZ"><h2>Contents</h2></a>
<ul>
<li><a href="#commandSORTMTZ">Command Input</a>
<li><a href="#inputSORTMTZ">Input File Details</a>
<li><a href="#outputSORTMTZ">Output File Details</a>
<li><a href="#outSORTMTZ">Header Information for Output MTZ File</a>
</ul>
<hr>

<a name="commandSORTMTZ"><h3>Command Input</h3></a>
<a href="C:\CCP4-Packages\ccp4-6.2.0\html/sortmtz.html#ascend">ASCEND/DESCEND</a>
<a href="C:\CCP4-Packages\ccp4-6.2.0\html/sortmtz.html#sort_keys">SORT KEYS</a>
<pre>
 Data line--- ASCEND
 Data line--- H K L M/ISYM BATCH
 Data line--- "C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_jpk_raw.mtz"
</pre>
<hr>

<a name="inputSORTMTZ"><h3>Input File Details</h3></a>
<pre>

 OPENED INPUT MTZ FILE
 Logical Name: C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_jpk_raw.mtz
 Filename: C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_jpk_raw.mtz

 * Title:

 Untitled

 * Base dataset:

       0 HKL_base
         HKL_base
         HKL_base

 * Number of Datasets = 1

 * Dataset ID, project/crystal/dataset names, cell dimensions, wavelength:

       1 New
         New
         New
            82.0600   82.0600  159.2500   90.0000   90.0000   90.0000
            0.97856

 * Number of Columns = 18

 * Number of Reflections = 1526614

 * Missing value set to NaN in input mtz file

 * Number of Batches = 360

 * Column Labels :

 H K L M/ISYM BATCH I SIGI IPR SIGIPR FRACTIONCALC XDET YDET ROT WIDTH
LP MPART FLAG BGPKRATIOS

 * Column Types :

 H H H Y B J Q J Q R R R R R R I I R

 * Associated datasets :

 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

 * Cell Dimensions : (obsolete - refer to dataset cell dimensions above)

  82.0600   82.0600  159.2500   90.0000   90.0000   90.0000

 *  Resolution Range :

   0.00030    0.15998     (     58.026 -      2.500 A )

 * Sort Order :

     0     0     0     0     0

 * Space group = 'P43212' (number     96)


 Spacegroup information obtained from library file:
 Logical Name: SYMINFO   Filename:
C:\CCP4-Packages\ccp4-6.2.0\lib\data\syminfo.lib

       5 sort keys, in columns    1   2   3   4   5
</pre>
<hr>

<a name="outputSORTMTZ"><h3>Output File Details</h3></a>
<pre>
   1526614 records read from file    1
 Data line--- "C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_chs_raw_batched_up_to_401.mtz"
</pre>
<hr>

<a name="inputSORTMTZ"><h3>Input File Details</h3></a>
<pre>

 OPENED INPUT MTZ FILE
 Logical Name: C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_chs_raw_batched_up_to_401.mtz
 Filename: C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_chs_raw_batched_up_to_401.mtz

 * Title:

 [No title given]

 * Base dataset:

       0 HKL_base
         HKL_base
         HKL_base

 * Number of Datasets = 1

 * Dataset ID, project/crystal/dataset names, cell dimensions, wavelength:

       1 New
         New
         New
            82.3400   82.3400  161.4500   90.0000   90.0000   90.0000
            0.97940

 * Number of Columns = 18

 * Number of Reflections = 536433

 * Missing value set to NaN in input mtz file

 * Number of Batches = 180

 * Column Labels :

 H K L M/ISYM BATCH I SIGI IPR SIGIPR FRACTIONCALC XDET YDET ROT WIDTH
LP MPART FLAG BGPKRATIOS

 * Column Types :

 H H H Y B J Q J Q R R R R R R I I R

 * Associated datasets :

 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

 * Cell Dimensions : (obsolete - refer to dataset cell dimensions above)

  82.3400   82.3400  161.4500   90.0000   90.0000   90.0000

 *  Resolution Range :

   0.00015    0.16000     (     82.340 -      2.500 A )

 * Sort Order :

     1     2     3     4     5

 * Space group = 'P43212' (number     96)

 Too many records
 SORTMTZ failed to release record to sort procedure, status =         1

 SORTMTZ:  Sorting failed
Times: User:       0.0s System:    0.0s Elapsed:     0:05
</pre>
</html>
***************************************************************************
* Information from CCP4Interface script
***************************************************************************
The program run with command: sortmtz HKLOUT
"C:/Users/Jacob/Desktop/Dallos_Lab/Analysis/NatVSulP/lo_res_natvsulp_comb_raw.mtz"
has failed with error message
 SORTMTZ:  Sorting failed
***************************************************************************


#CCP4I TERMINATION STATUS 0 " SORTMTZ:  Sorting failed"
#CCP4I TERMINATION TIME 18 Nov 2011  09:26:33
#CCP4I MESSAGE Task failed



--
*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
*******************************************

----------
From: Phil Evans


There's a limit set in CCP4/lib/src/sorting_main.f

C     NMAX_MEM increased from 16000000 to 32000000
      PARAMETER (NMAX_MEM = 32000000)

which you could change and recompile (:-))

You can combine unmerged files in Pointless, which is usually the best method, but it reads everything into memory, so you may well hit system limits unless you have a 64-bit system with lots of memory (and a 64-bit build).

Phil
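
A rough check of why the limit might be hit on the second file, using the counts from the SORTMTZ log above. This is only a sketch under a stated assumption: it treats NMAX_MEM as a cap on individual stored values (records times columns), which the thread does not confirm. The helper name values_needed is hypothetical and not part of CCP4.

```python
# Figures copied from the SORTMTZ log in the original post.
REFLECTIONS_FILE1 = 1_526_614   # first input file, 360 batches
REFLECTIONS_FILE2 = 536_433     # second input file, 180 batches
COLUMNS = 18                    # "Number of Columns = 18" in both files
NMAX_MEM = 32_000_000           # value quoted from sorting_main.f


def values_needed(n_records: int, n_columns: int) -> int:
    """Hypothetical helper: total values held if every column of every
    record is kept in memory. ASSUMPTION: this is the unit NMAX_MEM
    counts; the source code may count something else."""
    return n_records * n_columns


first_only = values_needed(REFLECTIONS_FILE1, COLUMNS)
combined = values_needed(REFLECTIONS_FILE1 + REFLECTIONS_FILE2, COLUMNS)

print(f"first file only: {first_only:>12,d}  within limit: {first_only <= NMAX_MEM}")
print(f"both files:      {combined:>12,d}  within limit: {combined <= NMAX_MEM}")
```

Under that assumption the first file alone fits (about 27.5 million values) while the two together do not (about 37.1 million), which would match the log: all records of the first file are read, and the failure appears only once the second file is opened.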

----------
From: Jacob Keller


Okay, it seems there is utility to Pointless--problem solved! Maybe
developers should consider upping the limit, though?

Thanks very much,

Jacob

----------
From: Phil Evans


Pointless should supersede sortmtz, at least for most purposes, but I agree the limit should be increased.

Actually, it's inconsistent in the source code: elsewhere there is

      PARAMETER (NMAX_MEM = 16000000)

but that may not matter

Phil

