Tuesday 11 October 2011

Superpose, SSM

From: Matthias Zebisch
Date: 26 September 2011 20:42


Dear CCP4 users,

I am using ccp4i version 6.2.0 under Windows 7. I've come across a problem with superpose.
The output file appears to have additional line feeds (see picture), which, however, are not visible in Windows Notepad.
The structure can also be opened in Coot and PyMOL. However, it is not possible to use it within CCP4, e.g. for a subsequent superposition.

Is this problem known to anybody, and is there a simple workaround available? I need to compare a hell of a lot of relative domain orientations...

I did not have this problem on a second computer with CCP4 6.1.2. When I updated it to 6.2.0, the situation was as described above.

Any help will be highly appreciated,

Thanks, Matthias

----------
From: Ed Pozharski


What do you mean by that?  Do you get an error when you use the
superpose output pdb with some other program?



--
"I'd jump in myself, if I weren't so good at whistling."
                              Julian, King of Lemurs

----------
From: Jacob Keller

I vaguely recall notepad doing something wacky with files in certain
cases...why don't you get the excellent text editor NoteTab Light
[sic] (I use it all the time--free and works great), then take a look
at your files and see whether MS notepad altered the files.

JPK
--
*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
*******************************************

----------
From: Matthias Zebisch

Hi again,

Thanks for your quick replies, but I think I did not make myself clear. Here is what I'm doing:

1) superpose proteinA.pdb onto proteinB.pdb: works, but writes proteinA_lsq1.pdb with extra empty lines (not the ANISOU lines ;o) )

2) superpose proteinA_lsq1.pdb onto proteinC.pdb: doesn't work because proteinA_lsq1.pdb cannot be read

Any ideas?
Even if there is some compatibility issue between CCP4 and Windows, I guess superpose should be able to read its own files, shouldn't it?

Thanks,

Matthias

----------
From: Edward A. Berry


What application ARE the extra lines seen in?

Sounds like a problem with different newline conventions-
<CR> vs <LF> vs <CR><LF> -
although I shouldn't have thought extra carriage returns and linefeeds
would confuse a program reading a PDB.

Maybe the latest update uses the windows <CR><LF> on output
but still unix <LF> for input

Anyway, to help the developers debug, it would be good to provide
a hex dump of a portion of the pdb file. If you don't have a windows
hex editor you can use debug (I assume it still exists in windows 7)

Open a dos shell and cd to the directory where the peculiar
file is, and

debug lsq1.pdb
 d 100 200
That is, dump from the beginning of the file (which is loaded
starting at hex 100) to hex 200. If there is a big header you
may want to skip down to the ATOM records with d 1000 1100
or such.
This gives a hex dump on the left and ASCII on the right, as below.
Lines are separated by two dots in the ASCII, which correspond
to 0D 0A in the hex dump:

-d 100 200
0B05:0100  57 79 61 72 74 69 74 65-3A 20 42 75 72 6E 73 20   Wyartite: Burns
0B05:0110  61 6E 64 20 46 69 6E 63-68 2C 20 41 6D 20 4D 69   and Finch, Am Mi
0B05:0120  6E 20 38 34 20 28 31 39-39 39 29 20 31 34 35 36   n 84 (1999) 1456
0B05:0130  2D 31 34 36 30 0D 0A 57-79 61 72 74 69 74 65 3A   -1460..Wyartite:
0B05:0140  20 43 72 79 73 74 61 6C-6C 6F 67 72 61 70 68 69    Crystallographi
0B05:0150  63 20 65 76 69 64 65 6E-63 65 20 66 6F 72 20 74   c evidence for t
0B05:0160  68 65 20 66 69 72 73 74-20 70 65 6E 74 61 76 61   he first pentava
0B05:0170  6C 65 6E 74 2D 75 72 61-6E 69 75 6D 0D 0A 20 6D   lent-uranium.. m
0B05:0180  69 6E 65 72 61 6C 0D 0A-31 31 2E 32 37 30 36 20   ineral..11.2706
0B05:0190  37 2E 31 30 35 35 20 32-30 2E 38 30 37 20 39 30   7.1055 20.807 90
0B05:01A0  20 39 30 20 39 30 20 50-32 32 32 0D 0A 55 31 0D    90 90 P222..U1.
0B05:01B0  0A 2E 36 31 31 20 2E 33-31 30 20 2E 31 35 34 20   ..611 .310 .154
0B05:01C0  33 0D 0A 43 0D 0A 2E 35-31 31 20 2E 34 31 30 20   3..C...511 .410
0B05:01D0  2E 31 35 34 20 32 0D 0A-61 6E 69 6F 6E 73 0D 0A   .154 2..anions..
0B05:01E0  4F 31 0D 0A 2E 35 31 31-20 2E 33 31 30 20 2E 31   O1...511 .310 .1
0B05:01F0  35 34 0D 0A 65 6E 64 F0-97 01 75 03 E8 CE E0 3C   54..end...u....<
0B05:0200  2E
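
In case debug is not available on a given Windows installation, a similar dump can be produced with a few lines of Python. This is only a minimal sketch: the function name hex_dump and the sample bytes are made up for illustration, and the layout only approximates what debug prints. As in the dump above, CR (0D) and LF (0A) appear as dots in the ASCII column.

```python
def hex_dump(data: bytes, start: int = 0, length: int = 256, width: int = 16) -> str:
    """Return a debug-style dump: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # width of a full row of "XX XX ... XX"
    rows = []
    chunk = data[start:start + length]
    for i in range(0, len(chunk), width):
        row = chunk[i:i + width]
        hex_part = " ".join(f"{b:02X}" for b in row)
        # Non-printable bytes, including CR (0D) and LF (0A), are shown as dots.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        rows.append(f"{start + i:08X}  {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(rows)


# Hypothetical sample: one record terminated by CR CR LF, the next by CR LF.
sample = b"ATOM      1  N   MET A   1\r\r\nEND\r\n"
print(hex_dump(sample))
```

For a real file, read it in binary mode first, e.g. data = open("proteinA_lsq1.pdb", "rb").read(), so that Python does not translate the line endings before you can look at them.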

----------
From: James Stroud

I think something in your workflow is inserting dos line feeds (\n\r or \r\n, I can't remember which).

If I have guessed correctly, you want to remove those "\r"s before proceeding (or never let them get in there in the first place).

You claim to open it with MS something, which would insert dos line feeds as part of Operation Vendor Lock. Did you happen to save it, perhaps by habit? That would do the trick.  It might even do something insidious and insert those linefeeds without your purposefully saving the document.

Your best bet to fix the file after corruption is vim (used to be that "crystallographers" could use real text editors).

The command in vim is:

 :%s/\r//g

You might find some third party utility that fixes linefeeds for $30.00 somewhere, if vim is too "retro".

Otherwise, you may want to start over, skip checking it out in MS something, and go straight to superpose.

James
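
For anyone without vim, the effect of the substitution above (delete every carriage return) can be sketched in Python. The function names and file names here are hypothetical, and whether superpose then accepts the cleaned file has not been tested in this thread.

```python
from pathlib import Path


def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every CR byte, like :%s/\\r//g in vim.

    Note: a file that uses lone CR as its only line terminator would
    collapse into a single line, so check the file first.
    """
    return data.replace(b"\r", b"")


def clean_file(src: str, dst: str) -> int:
    """Write a CR-free copy of src to dst; return the number of bytes removed."""
    raw = Path(src).read_bytes()  # binary mode: no newline translation
    cleaned = strip_carriage_returns(raw)
    Path(dst).write_bytes(cleaned)
    return len(raw) - len(cleaned)


# In-memory example: CR CR LF and CR LF both become a plain LF.
print(strip_carriage_returns(b"ATOM\r\r\nEND\r\n"))
```

A call such as clean_file("proteinA_lsq1.pdb", "proteinA_lsq1_unix.pdb") would leave the original untouched and write the cleaned copy next to it.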

----------
From: Robbie Joosten

One would assume that Windows software would read DOS/Windows type text files...
Open the file in Wordpad. Unlike Notepad, it is able to work with both Windows and Unix type text files. If you edit something and save the file, it will be in Windows style. If Superpose stops on that, it should really be updated. I'm sure that there are Windows versions of the programs unix2dos and dos2unix, which are the programs to use to convert one type to the other. You can also use Word to search and replace the linefeeds.

Good luck with this very retro problem.
Cheers,
Robbie



----------
From: David Waterman

Windows users might like to download the free editor Notepad++ (http://notepad-plus-plus.org/) as a replacement for the dire editor that comes as standard with the OS. This handles end of line conversions, does hex dumps, syntax highlighting etc. You *can* make a decent OS out of Windows - once you obtain enough third party components.

Cheers

-- David

----------
From: Pierre Rizkallah


Dear BB Readers,

I have used CCP4 on Windows exclusively for nearly 10 years now. The text editor of choice for PDB files coming out of CCP4 has always been WORDPAD, which comes with Windows, no need to download. NOTEPAD causes me great difficulty with PDB files, probably because of my age, but let's not get hung up on that. I have never been interested enough in chasing up ways of using NOTEPAD properly.

In the last couple of months, CCP4 started outputting PDB files that appeared in WORDPAD as one continuous line, i.e. no line feed. This was INTERMITTENT behaviour. I could not reproduce it to order. These files are readable by NOTEPAD, despite my dislike of this editor. I have not changed anything in my setup or in the CCP4 or Coot installation. So I could blame it on some system updates in WINDOWS that are invisible to me, but then again it could be a fault of something else, perhaps like changing the <CR><LF> combination already mentioned. The files are usually readable in COOT, which reads anything coming out of CCP4, but again output from COOT is occasionally unreadable in CCP4. NOTEPAD-readable files are usually OK. Unfortunately, I don't have a recipe for correct action.

Having said all that, at the moment output PDB files from CCP4 and COOT appear to be all well behaved in terms of readability, both in the Windows editors and in CCP4. I would like to think this problem has disappeared just as it had appeared suddenly. I shall continue to look out for symptoms of trouble.

Pierre
*******************************************************
Dr Pierre Rizkallah, Senior Lecturer in Structural Biology,
Wales Heart Research Institute, Heath Campus, Cardiff CF14 4XN
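
Since the behaviour is intermittent, it may help to record which end-of-line convention each output file actually uses. The following is a minimal sketch, with a hypothetical function name, that counts the terminators found in a file's raw bytes. A file showing "CR CR LF" would be consistent with "\r\n" being written through a text-mode stream on Windows, which turns each "\n" into "\r\n"; that is only one possible explanation for the extra empty lines and has not been confirmed for superpose.

```python
import re
from collections import Counter

# Longest alternatives first, so "\r\r\n" is not split into "\r" + "\r\n".
_EOL = re.compile(rb"\r\r\n|\r\n|\n|\r")

_NAMES = {b"\r\r\n": "CR CR LF", b"\r\n": "CR LF", b"\n": "LF", b"\r": "CR"}


def count_line_endings(data: bytes) -> Counter:
    """Count each kind of line terminator found in the raw bytes."""
    return Counter(_NAMES[m.group()] for m in _EOL.finditer(data))


# In-memory example with one terminator of each kind.
print(count_line_endings(b"a\r\r\nb\r\nc\nd\re"))
```

Reading the file with open(path, "rb").read() and passing the result to count_line_endings gives a quick summary; a well-formed Windows file should report only "CR LF" and a Unix file only "LF".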



----------
From: Jacob Keller


I want to pitch again for NoteTab--it is really useful, can do almost
everything a good text editor should. I particularly like to use
regexes, and you can do that with NoteTab, along with a ton of other
things. I am not affiliated in any way, and the program is totally
free and does not install toolbars or anything. Try it! Maybe it
should be part of ccp4!

Jacob

