Using PAIREF¶
Getting started¶
Open the terminal (GNU/Linux, macOS) or CCP4 console (Windows) or Phenix Command Prompt (Windows) and go to the folder where your structure model and diffraction data are saved. For example, let’s assume that your structure model nuclease_model.pdb has been refined using data file data_2A.mtz at 2 Å resolution and all the files are located in a folder /home/test/project/nuclease/dataset7/files/
. To change the current working directory, run a command:
cd /home/test/project/nuclease/dataset7/files/
However, you have also prepared full-resolution merged diffraction data file data_full_resolution.mtz and unmerged diffraction data file XDS_ASCII_full.HKL (both at 1.5 Å).
Note
The files data_full_resolution.mtz and data_2A.mtz should contain consistent free reflection sets.
The usage of unmerged data is not obligatory, however, it is strongly recommended as they are required for the CC* calculation. Various file formats are supported (.HKL from XDS, .mtz, .sca).
Now, you would like to perform the paired refinement protocol and use step-by-step following high resolution limits: 1.9 Å, 1.8 Å, 1.7 Å, 1.6 Å, and 1.5 Å. To execute these calculations, run a command:
cctbx.python -m pairef --HKLIN data_full_resolution.mtz --XYZIN nuclease_model.pdb -u XDS_ASCII_full.HKL -i 2 -r 1.9,1.8,1.7,1.6,1.5 -p nuclease
Then a new folder pairef_nuclease is created in the folder where the command has been executed and all the log files, new structure models, etc., will be saved there. Open a file PAIREF_nuclease.html in a web browser to see the current progress, results, plots, and statistics.
PAIREF will refine the input structure model (default 10 cycles in REFMAC5) against data up to 1.9 Å. Then it will calculate statistics relating to the refined model and plot graphs. After that, the refined model will be further refined against data up to 1.8 Å and its relating statistics will be computed. This will be also performed using the remaining high resolution diffraction limits 1.7 Å, 1.6 Å, and 1.5 Å. In the end, merging statisting will be calculated.
Graphical interface¶
PAIREF provides also a graphical inteface - see page Graphical interface.
Detailed specification of refinement parameters¶
To obtain meaningful results, the refinement setting during the paired refinement protocol should be very similar to the setting that has been used in previous refinement steps. PAIREF provides many option to run all the calculations under full control of the user.
Refinement software¶
Options -R
or --refmac
specify refinement in REFMAC5 (default), whereas options -P
or --phenix
refinement in phenix.refine.
Using external CIF file (LIBIN)¶
If a CIF file with external restrains has been used in previous refinement steps, it should be specified to be used also in the paired refinement. This can be specified with an option --LIBIN some_restrains.cif
(assuming that the file is saved in the folder where PAIREF is executed.
Number of refinement cycles¶
The number of refinement cycles that is be performed in every resolution step can be controlled using an option --ncyc value
, e.g. --ncyc 20
. The default setting is 10 cycles in REFMAC5 or 3 macro cycles in phenix.refine.
Special options for REFMAC5¶
Weighting term¶
The weight of the X-ray term for REFMAC5 can be specified using option -w value
, e.g. -w 0.5
.
TLS refinement¶
It is possible to perform a TLS refinement in PAIREF before a restrained refinement. An input TLS file is speciffied by an option --TLSIN
. A new TLS output file generated during a refinement is then used in the next refinement run (using data up to a higher resolution) as the TLS input file. To avoid this default behaviour and use the same in all the refinement runs, use an option --TLSIN-keep
. A number of TLS refinement cycles can be set (e.g. for 5 cycles: --TLS-ncyc 5
), 10 cycles are performed by default.
REFMAC5 parameters (Com file)¶
A Com file (command file) is a text file describing refinement parameters for REFMAC5. Important parameters are e.g. a weight matrix or a number of refinement cycles ncyc.
To obtain a command file of particular refinement job in CCP4, select the last refinement job and press ReRun Job.. . In a newly opened dialog, do not press Run Now but select Run & View Com File (details are described in the CCP4 documentation). Then a new dialog is opened – the text at the bottom is the content of the command file. Select it, press Ctrl+C to copy it, paste it in a text editor and save it as e. g. setting.com in the folder where the diffraction data and model are placed.
The command file is specified with an option -c setting.com
or --comfile setting.com
where setting.com is the file containing parameters for REFMAC5.
Note
Even thought the weight of the X-ray term or the number of refinement cycles are set in the Com file (REFMAC5 keywords WEIGht MATRix
and NCYC
), the values specified by the options --weight
and --ncyc
have the higher priority.
Constant FFT-grid¶
To keep the highest resolution FFT-grid in all the calculations, run PAIREF with an option --constant-grid
. The grid is then controlled by a REFMAC5 keyword SHANnon_factor.
Special option for phenix.refine¶
phenix.refine parameters¶
Refinement parameters for phenix.refine can be defined in a text file. Here, e.g. target weights or TLS groups can be set. See documentation of the program for more information. For example, it can contain a following content:
refinement.refine.strategy=tls+individual_sites+individual_adp
refinement.refine.adp.tls="chain A"
refinement.refine.adp.tls="chain B"
refinement.main.number_of_macro_cycles=4
refinement.target_weights.wxc_scale=3
refinement.target_weights.wxu_scale=5
refinement.simulated_annealing.start_temperature=5000
This file can be specified with an option -d phenix_params.def
or --def phenix_params.def
where phenix_params.def is a file name.
Modification of input structure model¶
The input structure model can be modified and refined at the starting resolution before the paired refinement. These options should be used if the structure has been refined in another software or another version than it is currently used, or the bias of previous free reflection selection is present. The number of refinement cycles at the starting resolution is be controlled by the option --prerefinement-ncyc
(20 cycles by default).
Possible modifications of the structure model:
- reset ADPs their mean value:
--prerefinement-reset-bfactor
, - add a value to the ADPs:
--prerefinement-add-to-bfactor ADD_TO_BFACTOR
, - set ADPs to a value:
--prerefinement-set-bfactor
, - perturb the atomic coordinates by an average of a value (0.25 Å by default):
--prerefinement-shake-sites [SHAKE_SITES]
, - no modification
--prerefinement-no-modification
.
Summary of program options¶
$ ccp4-python -m pairef -h
usage: ccp4-python -m pairef [--GUI] --XYZIN XYZIN --HKLIN HKLIN
[-u HKLIN_UNMERGED] [--LIBIN LIBIN]
[--TLSIN TLSIN] [-c COMIN] [-d DEFIN] [-R | -P]
[-p PROJECT] [-r RES_SHELLS] [-n N_SHELLS]
[-s STEP] [-i RES_INIT] [-f FLAG] [-w WEIGHT]
[--ncyc NCYC] [--constant-grid] [--complete]
[--TLS-ncyc TLS_NCYC] [--TLSIN-keep]
[--open-browser] [-h]
[--prerefinement-ncyc PREREFINEMENT_NCYC]
[--prerefinement-reset-bfactor]
[--prerefinement-add-to-bfactor ADD_TO_BFACTOR]
[--prerefinement-set-bfactor SET_BFACTOR]
[--prerefinement-shake-sites [SHAKE_SITES]]
[--prerefinement-no-modification]
Automatic PAIRed REFinement protocol
optional arguments specifying input files:
--GUI, --gui Start graphical user interface (usually requires to be
executed as ccp4-python, not as cctbx.python)
--XYZIN XYZIN, --xyzin XYZIN
PDB or mmCIF file with current structure model
--HKLIN HKLIN, --hklin HKLIN
MTZ file with processed diffraction data
-u HKLIN_UNMERGED, --unmerged HKLIN_UNMERGED
unmerged processed diffraction data file (e.g.
XDS_ASCII.HKL or data_unmerged.mtz)
--LIBIN LIBIN, --libin LIBIN
CIF file geometric restraints
--TLSIN TLSIN, --tlsin TLSIN
input TLS file (only for REFMAC5)
-c COMIN, --comfile COMIN
configuration Com file with keywords for REFMAC5
-d DEFIN, --def DEFIN
configuration def file with keywords for phenix.refine
-R, --refmac Use REFMAC5 (default)
-P, --phenix Use phenix.refine
other optional arguments:
-p PROJECT, --project PROJECT
project name
-r RES_SHELLS explicit definition of high resolution shells - values
must be divided using commas without any spaces and
written in decreasing order, e.g. 2.1,2.0,1.9
-n N_SHELLS number of high resolution shells to be added step by
step. Using this argument, setting of argument -s is
required.
-s STEP, --step STEP width of the added high resolution shells (in
angstrom). Using this argument, setting of argument -n
is required.
-i RES_INIT initial high-resolution diffraction limit (in
angstrom) - if it is not necessary, do not use this
option, the script should find resolution
automatically in PDB or mmCIF file
-f FLAG, --flag FLAG definition which FreeRflag set will be excluded during
refinement (set 0 default)
-w WEIGHT, --weight WEIGHT
manual definition of weighting term (only for REFMAC5)
--ncyc NCYC number of refinement cycles that will be performed in
every resolution step
--constant-grid keep the same FFT grid through the whole paired
refinement. (only for REFMAC5)
--complete perform complete cross-validation (use all available
free reflection sets)
--TLS-ncyc TLS_NCYC number of cycles of TLS refinement (10 cycles by
default, only for REFMAC5)
--TLSIN-keep keep using the same TLS input file in all the
refinement runs (only for REFMAC5)
--open-browser open web browser to show results (requires to be
executed as ccp4-python, not as cctbx.python)
-h, --help show this help message and exit
optional arguments specifying structure model modification:
--prerefinement-ncyc PREREFINEMENT_NCYC
number of refinement cycles to be performed as pre-
refinement of the input structure model before paired
refinement (the initial high resolution limit is
used). Pre-refinement is performed by default in case
of the complete cross-validation protocol. Other
related options are --prerefinement-reset-bfactor,
--prerefinement-add-to-bfactor, --prerefinement-set-
bfactor, --prerefinement-shake-sites, and
--prerefinement-no-modification. These options can be
useful when the structure has been refined in another
version of REFMAC5 or phenix.refine than it is
currently used or when you want to reset the impact of
used free reflections.
--prerefinement-reset-bfactor
reset atomic B-factors of the input structure model to
the mean value. This is done by default in the case of
the completecross-validation protocol.
--prerefinement-add-to-bfactor ADD_TO_BFACTOR
add the given value to B-factors of the input
structure model
--prerefinement-set-bfactor SET_BFACTOR
set atomic B-factors of the input structure model to
the given value.
--prerefinement-shake-sites [SHAKE_SITES]
randomize coordinates of the input structure model
with the given mean error value. This is done by
default in the case of the complete cross-validation
protocol - mean error 0.25.
--prerefinement-no-modification
do not modify the input structure model before the
complete cross-validation protocol
Dependencies: CCP4 Software Suite or PHENIX containing CCTBX with Python 2.7
Example:
- Structure model: nuclease_model.pdb (has been previously refined at 2.0 Å),
- Diffraction data – merged: data_full_resolution.mtz (data up to 1.5 Å),
- Diffraction data – unmerged: XDS_ASCII_full.HKL (data up to 1.5 Å),
- High resolution limits: 1.9 Å, 1.8 Å, 1.7 Å, 1.6 Å, and 1.5 Å;
- External restrains: ligands.cif,
- Command file including external harmonics (REFMAC5 parameters): setting.com.
- X-ray weight: 0.04
- Number of refinement cycles to be performed during every resolution step: 15
- Project name: nuclease,
cctbx.python -m pairef --HKLIN data_full_resolution.mtz --XYZIN nuclease_model.pdb -u XDS_ASCII_full.HKL --LIBIN ligands.cif --refmac -c setting.com -i 2 -r 1.9,1.8,1.7,1.6,1.5 -w 0.04 --ncyc 15 -p nuclease
The command file setting.com is the following text file:
make -
check NONE
refi -
resi MLKF -
meth CGMAT -
bref MIXED
scal -
type SIMP -
LSSC -
ANISO -
EXPE
solvent YES
external harmonic residues from 3 B to 4 B sigma 0.03
exte dist first chain A resi 777 atom CD second chain A resi 777 atom OE1 value 1.20 sigma 0.01
PNAME nuclease
DNAME nuclease_42
Advanced options¶
Complete cross-validation¶
To run the paired refinement protocol for each individual free reflections set (e.i. to perform the complete cross-validation), use an option --complete
. The input structure model is modified to remove the bias of previous free reflection selection. The default setting is:
- the atomic coordinates are perturbed by an average of 0.25 Å,
- ADPs are set to their average value.
The modified model is then refined at the starting resolution, the number of refinement cycles is controlled by an option --prerefinement-ncyc
(20 cycles by default). To disable the automatic modification, use an option --prerefinement-no-modification
. For further information about the input model modification, see the section Modification of input structure model.
Problems¶
Something is not working? Are you worried that you did not understand well? Is an important feature missing? Do you like our project? Do not hesitate – please write us: martin.maly@fjfi.cvut.cz.