Topic: QT clustering in VMD
craig
Thorpe Group
Isostatic
   
Posts: 80
Hi all...
I've got another VMD Tcl script that people might find useful. It uses QT (quality threshold) clustering to separate a trajectory into clusters. The user specifies a cutoff, and clusters are built such that every pair of conformers in a given cluster has an RMSD within the cutoff. I think it runs in polynomial time (between N^2 and N^3, where N is the number of frames). On my PC it clustered a 200-frame trajectory of a small molecule in about 10 seconds.
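For anyone who wants to see the idea outside of VMD, here is a minimal Python sketch of greedy QT clustering. It is not the Tcl script from this post. The function name `qt_cluster`, the precomputed pairwise RMSD matrix `dist`, and the toy one-dimensional data are all assumptions made for illustration, and the tie-breaking rules may differ from what the script does.

```python
def qt_cluster(dist, cutoff):
    """Greedy QT clustering on a symmetric distance matrix (list of lists).

    Every pair of members in a returned cluster is within `cutoff`.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best = []
        # Grow one candidate cluster from every remaining frame.
        for seed in sorted(remaining):
            cluster = [seed]
            # reach[j] = largest distance from frame j to any current member
            reach = {j: dist[seed][j] for j in remaining if j != seed}
            while reach:
                # Add the frame that keeps the cluster diameter smallest.
                j = min(reach, key=lambda k: (reach[k], k))
                if reach[j] > cutoff:
                    break
                cluster.append(j)
                del reach[j]
                for k in reach:
                    reach[k] = max(reach[k], dist[j][k])
            if len(cluster) > len(best):
                best = cluster
        # Keep the largest candidate, remove its frames, and repeat.
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


# Toy stand-in for an RMSD matrix: "frames" are points on a line.
coords = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
dist = [[abs(a - b) for b in coords] for a in coords]
print(qt_cluster(dist, 0.5))  # prints [[0, 1, 2], [3, 4], [5]]
```

In a real run the matrix would hold the pairwise RMSD between trajectory frames; building it is the part that needs VMD.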
Bug reports are, as always, welcome.
--craig
craig
I've added some new functionality to this script. To pick out clusters with diameter d, you still type
qtclust d
at the VMD prompt. If you want to pick out maximally-spaced conformers from within those clusters, type
max_spacing [qtclust d]
The algorithm is pretty simple. It randomly picks one frame out of each cluster as a first guess, then calculates the total RMSD summed over all pairs of structures in this test set. It then tries to improve the test set by randomly replacing one element with another frame from the same cluster; the changed set is kept if the total RMSD increases. The default is to go through 100 cycles of this; you can increase it by using
max_spacing [qtclust d] numcycles
Of course it would be possible to do something much more sophisticated (a Metropolis algorithm, for example) but this is good enough for what I'm doing. If anyone feels like improving it, have at it.
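The random-replacement search described above can be sketched in a few lines of Python. This is an illustration, not the Tcl code behind `max_spacing`: the helper `total_spread`, the `rng` parameter, the precomputed distance matrix, and the reading of one "cycle" as one replacement attempt are all assumptions.

```python
import random


def total_spread(frames, dist):
    """Sum of pairwise distances over all pairs in the test set."""
    return sum(dist[a][b] for i, a in enumerate(frames) for b in frames[i + 1:])


def max_spacing(clusters, dist, numcycles=100, rng=None):
    """Pick one frame per cluster, trying to maximize the total pairwise distance."""
    rng = rng or random.Random()
    # First guess: one random frame from each cluster.
    picks = [rng.choice(c) for c in clusters]
    best = total_spread(picks, dist)
    for _ in range(numcycles):
        # Swap one pick for another frame from the same cluster.
        i = rng.randrange(len(picks))
        trial = list(picks)
        trial[i] = rng.choice(clusters[i])
        score = total_spread(trial, dist)
        # Keep the change only if the total distance went up.
        if score > best:
            picks, best = trial, score
    return picks


# Toy stand-in for an RMSD matrix: "frames" are points on a line.
coords = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
dist = [[abs(a - b) for b in coords] for a in coords]
print(max_spacing([[0, 1, 2], [3, 4], [5]], dist, numcycles=500))
```

Because a change is accepted only when the total increases, the search can never get worse, but it can stall at a set that no single swap improves. A Metropolis-style acceptance rule would be one way around that.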
craig
I've added a couple of new features.
init_array now takes an optional argument that specifies the selection text. So, for example, if you want to cluster based only on the backbone coordinates and neglect the sidechains, you can type
init_array backbone
before using the qtclust command. The default behavior is still to cluster using the coordinates of all non-hydrogen atoms. The option uses ordinary VMD selection syntax, so you could do something like
init_array "occupancy > 0.1 and (not nucleic) and within 4.0 of (resname CYS and noh)"
if you're inclined. Also, you might notice that the output looks a little different: the row of frame numbers listing each cluster's members is now followed by a single number in parentheses. This is the frame closest to the centroid of the cluster, meaning that it is closer to all of its neighbors than any other member of the cluster is.
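One way to read "closest to the centroid" is as the cluster member with the smallest summed distance to all other members (sometimes called the medoid). The Python sketch below shows that reading. The function name `central_frame`, the summed-distance criterion, and the tie-break on the lower frame number are assumptions, since the post does not spell out the exact rule the script uses.

```python
def central_frame(cluster, dist):
    """Return the member with the smallest summed distance to the rest of the cluster."""
    return min(cluster, key=lambda i: (sum(dist[i][j] for j in cluster), i))


# Toy stand-in for an RMSD matrix: "frames" are points on a line.
coords = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
dist = [[abs(a - b) for b in coords] for a in coords]
print(central_frame([0, 1, 2], dist))  # prints 1
```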
Happy clustering.
--craig