Large Structures Represented in mmCIF/PDBx

Consolidated PDBx/mmCIF and PDBML format files are now provided for large molecular structures previously released as 'split' collections of multiple PDB format data files. Entries of this type are structures too large to be represented within a single PDB format data file because they contain too many atoms, too many polymer chains, or both. A few examples are listed below. A complete list of entries containing large structures is also available, as is a correspondence table describing prior 'split' entries that have been consolidated into single PDBx/mmCIF data files.
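The size constraint described above can be sketched as a simple check. This is only an illustration: the numeric limits are assumptions based on the fixed-column layout of the legacy PDB format (a 5-column atom serial number and a 1-column chain identifier) and are not stated in this text, and the function name `legacy_pdb_problems` is hypothetical.

```python
import string

# Assumed limits of the legacy fixed-column PDB format (not stated in the text above).
MAX_PDB_ATOMS = 99_999  # atom serial number occupies 5 columns
# Assumes single-character, alphanumeric chain identifiers only (62 distinct values).
MAX_PDB_CHAINS = len(string.ascii_letters + string.digits)


def legacy_pdb_problems(n_atoms, chain_ids):
    """Return the reasons a structure cannot fit in one PDB format file."""
    problems = []
    if n_atoms > MAX_PDB_ATOMS:
        problems.append("too many atoms")
    if len(set(chain_ids)) > MAX_PDB_CHAINS:
        problems.append("too many polymer chains")
    return problems


# A small structure fits; a hypothetical one with 150,000 atoms does not.
print(legacy_pdb_problems(1_000, ["A", "B"]))
print(legacy_pdb_problems(150_000, ["A", "B"]))
```

An empty list means the structure would fit under these assumed limits; a structure can also fail both checks at once.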

A small set of extensions has been used in preparing the consolidated data files so that the full structure can be represented in a single file. Specifically, these consolidated files extend the format conventions used in other PDBx/mmCIF data files in the FTP archive in the following ways:

  • Atom serial numbers run from 1 to the number of deposited atoms (with no field-width restrictions).
  • Chain identifiers of up to 4 characters are permitted. The PDB chain identifier corresponds to the "_atom_site.auth_asym_id" data item.
  • Cartesian coordinates may have 3 decimal places of precision (with a field width as large as required).
  • Isotropic B-factors and occupancies may have 3 decimal places of precision.
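As a rough illustration of what these conventions mean for a reader of the files, the sketch below splits one atom record on whitespace instead of fixed columns and checks the serial number and chain identifier. The column subset, the sample row, and the function names are hypothetical; real "_atom_site" loops contain many more items, and values that are quoted in the file would need a proper mmCIF parser rather than `str.split`.

```python
# Hypothetical subset of _atom_site data items, in an assumed column order.
ITEMS = ["id", "auth_asym_id", "Cartn_x", "Cartn_y", "Cartn_z",
         "occupancy", "B_iso_or_equiv"]


def parse_atom_site_row(row, items=ITEMS):
    """Split a whitespace-delimited row and pair each token with its item name."""
    tokens = row.split()
    if len(tokens) != len(items):
        raise ValueError(f"expected {len(items)} values, got {len(tokens)}")
    return dict(zip(items, tokens))


def follows_extensions(record):
    """Check the serial number and chain identifier conventions listed above."""
    serial_ok = int(record["id"]) >= 1                  # no upper field-width limit
    chain_ok = 1 <= len(record["auth_asym_id"]) <= 4    # up to 4 characters
    return serial_ok and chain_ok


# Made-up row: a 7-digit serial number and a 2-character chain identifier.
row = "1234567 AB 1023.456 -87.001 5.250 1.000 45.678"
record = parse_atom_site_row(row)
print(record["auth_asym_id"], float(record["Cartn_x"]), follows_extensions(record))
```

Because values are delimited by whitespace rather than by column position, a serial number or coordinate of any width parses the same way.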

Complete large structure entries can be downloaded in mmCIF/PDBx and PDBML/XML formats.

Example Entry    Download format    Structure type
4v40             (mmCIF) (PDBML)    Beta-galactosidase
4v41             (mmCIF) (PDBML)    Beta-galactosidase
4v43             (mmCIF) (PDBML)    GroEL
4v49             (mmCIF) (PDBML)    Ribosome
4v4a             (mmCIF) (PDBML)    Ribosome
4v4f             (mmCIF) (PDBML)    TRAP bound to RNA
4v8q             (mmCIF) (PDBML)    Ribosome
4v5a             (mmCIF) (PDBML)    Ribosome
4v5v             (mmCIF) (PDBML)    Spliceosomal U4 SNRNP core
4v4g             (mmCIF) (PDBML)    Ribosome