INSTALL AND BUILD INSTRUCTIONS

  1. Download the library.
  2. Install the pdbx module in one of two ways: add the directory that contains the pdbx module to your PYTHONPATH, or move the pdbx directory (with its subdirectories) to a location already on your Python search path. To see the current search path, import sys in Python (e.g., in IDLE) and inspect sys.path. For example, in Bash (the path is made absolute with $PWD so that the import works from any directory):
       mkdir -p source/python/modules
       mv pdbx source/python/modules
       PYTHONPATH=$PYTHONPATH:$PWD/source/python/modules
       export PYTHONPATH
  3. Test the installation by executing python PdbxReaderTests.py and python PdbxReadWriteTests.py in /path/to/pdbx/reader, or python PdbxWriterTests.py in /path/to/pdbx/writer.
  4. If you do not receive any 'module not found' errors and the tests run, you should be able to import from the pdbx module anywhere.
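
The check in step 4 can also be made from inside Python. The following is a minimal sketch; the helper name pdbx_is_importable is an illustrative assumption and not part of the library. It only attempts the import and, on failure, lists the directories Python searched.

```python
import sys


def pdbx_is_importable():
    """Return True if `import pdbx` succeeds with the current sys.path."""
    try:
        import pdbx  # noqa: F401 (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if pdbx_is_importable():
        print("pdbx found")
    else:
        # Listing sys.path shows which directories Python actually searched.
        print("pdbx not found; directories searched:")
        for entry in sys.path:
            print("  " + entry)
```

If the module is not found, compare the printed directories with the one you added to PYTHONPATH in step 2.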

PYTHON EXAMPLES

Connections.py
Uses the PDBX library to interface with Chimera. Shows how to retrieve and iterate over the struct_conn category, which delineates connections in a molecule, and locate connections of interest (in this case, covalent bonds) for Chimera to emphasize and animate.
Structures.py
Uses the PDBX library to interface with Chimera. Shows how to retrieve and iterate over the struct_site_gen category, which delineates members of structurally relevant sites in a molecule, and locate all structurally relevant sites for Chimera to emphasize and animate.
Connections3.py
Shows one way to use the information about a partner atom in a connection, given in the struct_conn category, to identify that atom in the atom_site category and determine its (x,y,z) Cartesian coordinates. In this case, we look for partner atoms involved in covalent bonds and report their coordinates.
Connections2.py
Uses the PDBX library to interface with Chimera. Shows how to find connections of certain types that involve certain entities by retrieving and iterating over the struct_conn category, which delineates connections in a molecule, and using the struct_asym and entity categories to determine the entity types involved in each connection. In this case, polymer-polymer covalent bonds are sought for Chimera to display and animate.
FASTA.py
This example shows how the (sequence) information contained in a CIF file can be readily accessed and transformed into another format. This particular example implements a FASTA converter, which reads the monomer sequences in the entity_poly_seq category and translates them into the single-letter FASTA format.
Assemblies.py
A more involved and extensive example that uses the PDBX library to generate a CIF file for each biological assembly listed in the pdbx_struct_assembly category of a CIF file. This example synthesizes information located in the pdbx_struct_assembly_gen, pdbx_struct_oper_list, and atom_site categories to accomplish this task.
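
To make the FASTA.py description above more concrete, here is a hedged sketch of the translation step alone. It is not the code in FASTA.py: the names THREE_TO_ONE and to_fasta and the header format are assumptions made for illustration, the input is assumed to be (entity_id, mon_id) pairs already extracted from entity_poly_seq in sequence order, and only the 20 standard amino acids are mapped (everything else, including nucleotides, becomes "X").

```python
# One-letter codes for the 20 standard amino acids; anything else becomes "X".
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_fasta(rows, header_prefix="entity"):
    """Build FASTA text from (entity_id, mon_id) pairs given in sequence order."""
    sequences = {}
    order = []  # remembers the order in which entity ids first appear
    for entity_id, mon_id in rows:
        if entity_id not in sequences:
            sequences[entity_id] = []
            order.append(entity_id)
        sequences[entity_id].append(THREE_TO_ONE.get(mon_id.upper(), "X"))
    lines = []
    for entity_id in order:
        # Hypothetical header format: ">entity_<id>".
        lines.append(">%s_%s" % (header_prefix, entity_id))
        lines.append("".join(sequences[entity_id]))
    return "\n".join(lines)
```

With the library, such pairs could be collected from the rows returned by getRowList(), using getAttributeIndex("entity_id") and getAttributeIndex("mon_id") to locate the two columns.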

Basic I/O Operations

Reading and writing are handled by the PdbxReader (in pdbx.reader.PdbxReader) and PdbxWriter (in pdbx.writer.PdbxWriter) classes, respectively.

Using PdbxReader

Imports: PdbxReader from pdbx.reader.PdbxReader, * from pdbx.reader.PdbxContainers
  1. open() a CIF file and store the file handle
    ifh = open("/path/to/file.cif")
  2. Initialize a PdbxReader object with the input file handle
    pRd = PdbxReader(ifh)
  3. Initialize a list to be populated with the DataContainer and/or DefinitionContainer objects parsed from the CIF file (both classes inherit from ContainerBase); each data block maps to one DataContainer object
    data = []
  4. Call the read(self, containerList) method with your list
    pRd.read(data)
  5. Your list is now populated with one or more DataContainer objects, which represent data blocks. To get the first data block, index into the list:
    block = data[0]
  6. To retrieve a category object, use the getObj(self, name) method
    struct_conn = block.getObj("struct_conn")
  7. To retrieve a value stored in a category table, e.g., the connection type of the first linkage described in the struct_conn category table, use the getValue(self, attributeName=None, rowIndex=None) method
    connType = struct_conn.getValue("conn_type_id", 0)
  8. See below for other methods to handle blocks, and, subsequently, the contents of the category objects they contain.
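
The steps above can be collected into two small helpers. This is a sketch under stated assumptions rather than library code: load_containers and first_value are hypothetical names, the import path follows the module location given earlier (pdbx.reader.PdbxReader), and the guards rely on getObj() returning None for a missing category.

```python
def load_containers(cif_path):
    """Parse cif_path and return the list of containers it holds."""
    # Imported here so the sketch can be loaded even where pdbx is absent.
    from pdbx.reader.PdbxReader import PdbxReader

    containers = []
    with open(cif_path) as ifh:  # the handle is closed once parsing is done
        PdbxReader(ifh).read(containers)
    return containers


def first_value(block, category_name, attribute_name):
    """Return the attribute's value in row 0 of a category, or None."""
    category = block.getObj(category_name)  # None when the category is absent
    if category is None or category.getRowCount() == 0:
        return None
    if not category.hasAttribute(attribute_name):
        return None
    return category.getValue(attribute_name, 0)


# Usage, assuming pdbx is importable and the file exists:
#   containers = load_containers("/path/to/file.cif")
#   conn_type = first_value(containers[0], "struct_conn", "conn_type_id")
```

Returning None instead of raising keeps the caller simple when a file lacks the struct_conn category; whether that is the right behavior depends on the application.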

Using PdbxWriter

Imports: PdbxWriter from pdbx.writer.PdbxWriter, * from pdbx.reader.PdbxContainers
  1. open() a file for writing and store the file handle
    ofh = open("path/to/out.cif", "w")
  2. Initialize a PdbxWriter object with the output file handle
    pWt = PdbxWriter(ofh)
  3. The two major PdbxWriter write methods are write(self, containerList), which takes a list of containers, data and/or definition, and writeContainer(self, container), which takes a single data or definition container.
  4. Now you can declare one or more DataContainer/DefinitionContainer objects and write them.
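
As a hedged illustration of the two write methods in step 3, the sketch below picks write() for a list of containers and writeContainer() for a single one, and then uses that helper to copy a CIF file. The function names write_containers and copy_cif are assumptions made for this example; only PdbxReader, PdbxWriter, read(), write(), and writeContainer() come from the library as described here.

```python
def write_containers(writer, payload):
    """Write one container or a list of containers with a PdbxWriter-like object."""
    if isinstance(payload, (list, tuple)):
        # write() takes a list of data and/or definition containers.
        writer.write(list(payload))
    else:
        # writeContainer() takes a single data or definition container.
        writer.writeContainer(payload)


def copy_cif(in_path, out_path):
    """Read every container from in_path and write them all to out_path."""
    # Imported lazily so the sketch loads even where pdbx is absent.
    from pdbx.reader.PdbxReader import PdbxReader
    from pdbx.writer.PdbxWriter import PdbxWriter

    containers = []
    with open(in_path) as ifh:
        PdbxReader(ifh).read(containers)
    with open(out_path, "w") as ofh:
        write_containers(PdbxWriter(ofh), containers)
```

Reading and rewriting an existing file avoids constructing containers by hand, which this document does not cover.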

Containers and Methods

All of the containers are accessible through pdbx.reader.PdbxContainers. DataContainer (to which data blocks map) and DefinitionContainer both derive from ContainerBase, which maintains an internal dictionary of DataCategory objects (to which categories map); DataCategory in turn derives from DataCategoryBase. The following are some methods of interest for these three classes.

DefinitionContainer/DataContainer

  • exists(self, name) - returns a bool indicating whether or not the DataCategory object named name exists in this container
  • getObj(self, name) - returns the DataCategory object named name, or None if it doesn't exist
  • getObjNameList(self) - returns the list of category names within this container
  • printIt(self, fh=sys.stdout, type="brief") - prints out the contents of the container
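
The container methods listed above combine naturally when inspecting an unfamiliar file. The sketch below is illustrative only: category_row_counts and missing_categories are hypothetical helper names, and block is assumed to be a DataContainer obtained from PdbxReader.read().

```python
def category_row_counts(block):
    """Map each category name in a container to its number of rows."""
    counts = {}
    for name in block.getObjNameList():
        category = block.getObj(name)
        if category is not None:  # getObj returns None for unknown names
            counts[name] = category.getRowCount()
    return counts


def missing_categories(block, required):
    """Return the names in `required` that the container does not hold."""
    return [name for name in required if not block.exists(name)]


# Usage, assuming `block` came from a parsed CIF file:
#   print(category_row_counts(block))
#   print(missing_categories(block, ["struct_conn", "atom_site"]))
```

Checking with exists() before calling getObj() is one way to fail early when an example depends on categories that a given file may not include.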

DataCategory

  • __getitem__(self, x) - special method, category[x] returns the row specified by the integer x in category
  • get(self) - returns 3-tuple consisting of (categoryName, attributeNameList, rowList)
  • getRowList(self) - returns a list of all the rows in the category table
  • getRowCount(self) - returns the number of rows in the category table
  • getRow(self, index) - attempts to fetch the row at index index and returns an empty list if it fails
  • getAttributeList(self) - returns a list of attribute/data item names
  • getAttributeCount(self) - returns the number of attributes/columns in the category table
  • getAttributeIndex(self, attributeName) - returns the index of the attribute specified by attributeName or -1 if not found
  • hasAttribute(self, attributeName) - returns a bool indicating whether or not the category has the attribute attributeName
  • getIndex(self, attributeName) - same as getAttributeIndex(self, attributeName)
  • getValue(self, attributeName=None, rowIndex=None) - returns the value of the attribute attributeName at row index rowIndex
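
The DataCategory methods above are enough to turn a category table into ordinary Python structures. The following sketch uses hypothetical helper names (category_to_dicts, select_rows) and assumes that each row is a sequence whose values are ordered like the list returned by getAttributeList().

```python
def category_to_dicts(category):
    """Return the category table as a list of {attribute: value} dicts."""
    names = category.getAttributeList()
    return [dict(zip(names, row)) for row in category.getRowList()]


def select_rows(category, attribute_name, wanted):
    """Return the rows whose attribute_name column equals wanted."""
    column = category.getAttributeIndex(attribute_name)
    if column < 0:  # getAttributeIndex reports -1 for unknown attributes
        return []
    return [row for row in category.getRowList() if row[column] == wanted]


# Usage, assuming `struct_conn` came from block.getObj("struct_conn") and that
# "covale" is the conn_type_id value of interest:
#   covalent = select_rows(struct_conn, "conn_type_id", "covale")
```

Looking up the column index once and then scanning getRowList() avoids a getValue() call per row, which may matter for large categories such as atom_site.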