SCAssign | Daiwen Yang's Group

Synopsis:

SCAssign (Side-Chain Assignment) is a Sparky extension written in Python to assist the assignment of aliphatic side-chain resonances of uniformly ¹³C,¹⁵N-labeled large proteins. It is based on a general strategy recently developed in our lab, which makes use of 4D ¹³C,¹⁵N-edited NOESY, MQ-(H)CC_mH_m-TOCSY, and the prior backbone assignments. SCAssign runs on virtually all operating systems on which Sparky is available, and is easy to install, set up and use. The above screenshots show how it looks in Linux.

Features:

SCAssign offers plenty of useful features that can greatly accelerate the assignment process. Many tasks that used to take weeks or even months can now be done in just a few days. These features include:

Import and isotope correction of chemical shifts from the prior backbone assignments
Friendly GUI for defining (H^N, N, C) spin triplets and examining the potential matching NOE peaks
Real time peak picking and automatic peak match to the defined spin triplets
Simultaneous view of both the current and the referential C-H planes for a more reliable assignment
One-click to assign/unassign NOE peaks and auto-aliasing of the assigned peaks
Strip plot in CCH-TOCSY of selected NOE peaks to confirm assignment or resolve ambiguity
Ability to identify and assign many weak NOE peaks using CCH-TOCSY

Availability:

The source code of SCAssign is provided as SCAssign (download) at this website. The download page also contains a step-by-step installation guide. Once you have it installed, carefully read the instructions on how to set up and use SCAssign, and then try it out with your own data (it is not feasible for us to provide a sample data set for download because of the huge file size of 3D/4D spectra). Some known issues with SCAssign are also listed for your information, together with workarounds. Feel free to contact us if you need further support.

After extraction of the zip archive (~18KB), you should see four files with .py extension in the SCAssign folder, namely spectra_setup.py, import_shifts.py, sidechain_assign.py, and sparky_init.py. A brief description of each file is listed below.

`spectra_setup.py`	`Setup spectra and preferences`
`import_shifts.py`	`Import chemical shifts and correct for deuterium labeling`
`sidechain_assign.py`	`Main program for side-chain assignment`
`sparky_init.py`	`Load SCAssign extension upon Sparky startup`

Installation:

For more information on Sparky extensions, you can read the Extensions in Python section of the Sparky manual.

Normally all user-customized Sparky extensions are stored in the Python folder under a user’s home Sparky directory. If you have never installed any customized Sparky extension before, simply create a Python folder under your home Sparky directory, and copy all files in the SCAssign folder to this folder. Once this is done, launch Sparky as usual (no compilation is required since Sparky will do that in real time) and you should find in the “Extensions” menu a new menu entry named “Sidechain assign” with an accelerator “sa”.

Please note that the above operation may overwrite any existing sparky_init.py file in the Python folder. So if you already have some of your own Sparky extensions installed, here are the steps to add SCAssign to your extension list:

Copy only spectra_setup.py, import_shifts.py, and sidechain_assign.py to the Python folder.
Locate sparky_init.py in the same folder and open it with your favorite text editor.
Insert the following lines into the initialize_session() function and save the file. You need to replace session with the parameter you used when defining the initialize_session() function.

    def sa_command(s = session):
        import sidechain_assign
        sidechain_assign.show_sidechain_assign_dialog(s)

    session.add_command("sa", "Sidechain assign", sa_command)

How to use:

Format of Shifts File:

In order for SCAssign to use the chemical shifts from prior backbone assignments, first you need to prepare a file in plain text that specifies those shifts. The format of this shifts file is illustrated below.

`1`	`LYS`	`CA`	`56.304`
`1`	`LYS`	`CB`	`33.336`
`2`	`THR`	`CA`	`62.145`
`2`	`THR`	`CB`	`69.736`
`2`	`THR`	`H`	`8.236`
`2`	`THR`	`N`	`116.334`
`3`	`GLU`	`CA`	`56.124`
`3`	`GLU`	`CB`	`31.485`
`3`	`GLU`	`H`	`8.576`
`.`	`...`	`.`	`...`

As seen from the above example, the file usually contains multiple lines, each consisting of four data fields (i.e., residue number, residue name, atom name, and chemical shift) separated by spaces or tabs. The following table summarizes the required data format for each field. Please comply with it to avoid errors during import.

`Residue number:`	`Sequence number of the residue, counting from 1 and increasing monotonically`
`Residue name:`	`Amino acid name of the residue, in standard 3-letter or 1-letter code`
`Atom name:`	`In BMRB nomenclature, H for amide hydrogen, N for amide nitrogen, CA for α-carbon, ...`
`Chemical shift:`	`Value of the chemical shift, floating point number, can be positive or negative`
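
The format rules above can be sketched as a small stand-alone checker. This is only an illustration, not SCAssign's actual import code: the function name parse_shifts, the error messages, the handling of blank lines, and the reading of "increasing monotonically" as "never decreasing from one line to the next" are all assumptions. Atom names are passed through unchecked.

```python
# Hypothetical sketch of a shifts-file checker; not taken from SCAssign itself.
VALID_3 = {"ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
           "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL"}
VALID_1 = set("ARNDCQEGHILKMFPSTWYV")


def parse_shifts(text):
    """Return (records, errors) for the text of a shifts file.

    records: list of (residue number, residue name, atom name, shift)
    errors:  list of (line number, message)
    """
    records, errors = [], []
    last_resnum = None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are skipped silently
        fields = line.split()  # splits on any run of spaces or tabs
        if len(fields) != 4:
            errors.append((lineno, "expected 4 fields, found %d" % len(fields)))
            continue
        resnum_s, resname, atom, shift_s = fields
        try:
            resnum = int(resnum_s)
        except ValueError:
            errors.append((lineno, "residue number is not an integer"))
            continue
        if resnum < 1:
            errors.append((lineno, "residue number must count from 1"))
            continue
        if last_resnum is not None and resnum < last_resnum:
            errors.append((lineno, "residue number decreases"))
            continue
        last_resnum = resnum
        name = resname.upper()
        if name not in VALID_3 and name not in VALID_1:
            errors.append((lineno, "unknown residue name"))
            continue
        try:
            shift = float(shift_s)  # positive or negative values are accepted
        except ValueError:
            errors.append((lineno, "chemical shift is not a number"))
            continue
        records.append((resnum, name, atom, shift))
    return records, errors
```

Calling parse_shifts(open(path).read()) on a file and printing the returned errors list gives the line number and type of each problem, which mirrors the kind of report described in the next section.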

Error Checking:

SCAssign checks the shifts file for format errors before import. If an error is found, a message is displayed indicating the location and type of the error (as shown below). In this case, edit the file to correct it, and then try again.

Isotope Correction:

In our situation, the prior backbone assignments were obtained from deuterated samples while the 4D NOESY and CCH-TOCSY experiments were conducted using non-deuterated samples. The C^α and C^β chemical shifts, therefore, need to be corrected for the isotope effect after import before they can be used to assign side-chain resonances. SCAssign has a feature of automatically performing this correction. To enable it, simply turn on the checkbox labeled “Correct for deuterium isotope effect” near the bottom of the window. You may read the following article if you want to know more details about deuterium isotope effect and the basis for its correction:

Venters RA, Farmer BT 2nd, Fierke CA, Spicer LD Characterizing the Use of Perdeuteration in NMR Studies of Large Proteins: 13C, 15N and 1H Assignments of Human Carbonic Anhydrase II J. Mol. Biol. 1996, 264(5):1101-1116

Define Spin Triplets:

SCAssign displays a protein sequence in a viewing window of three consecutive residues. The slider bar allows you to move forward and backward along the sequence. Each residue is identified by its one-letter amino acid code and sequence number. To begin with the side-chain assignment, first of all you need to define a (H^N, N, C) spin triplet for which matching peaks will be found later by SCAssign (at this point you may wish to refresh on the assignment procedure that was introduced earlier). This can be done intuitively by selecting H^N, N and C spins from the pull-down menus. As H^N and N of the same residue together define a C-H plane in 4D NOESY and would not make sense if separated, they are grouped in one entry for your convenience and labeled as “NH”.

In many cases the previous backbone assignment may contain gaps and holes. If a residue was left completely unassigned (i.e., chemical shifts of none of its spins are known), its pull-down menu is disabled and grayed out. The unassigned NH spins, such as those of proline residues, are also disabled on the menu. SCAssign uses empirical ranges for unassigned CA and CB and labels them in red (not shown in the demo). You may leave them out in the first round, and try to assign them later after the majority of the aliphatic side-chain resonances have been assigned. Only one spin triplet can be selected at a time. Selected spins are displayed on the menu buttons.

Peak Match Results:

Once you have defined a spin triplet, SCAssign will carry out the peak match in real time and return the result almost instantly. There might be a slight delay of up to a few seconds for triplets containing C spins other than CA and CB since a much wider empirical range needs to be scanned through. All matching peaks are sorted from strongest to weakest by the magnitude of data height, which is an estimate of peak intensity. Their frequencies are tabulated. The total number of peaks found and the chemical shift of each of the selected spins (the average chemical shift in the case of CG, CD, etc.) are shown in the status bar for your reference. The demo below provides some examples.

Besides listing the matching peaks in its peak list, SCAssign highlights in the spectrum view the region where it has searched to find those peaks. Since peaks often get folded in a multi-dimensional NMR spectrum, different colors are used here to mark the boundaries of the region, depending on whether the matching peaks in that region are positive or negative. The defaults are yellow for positive peak region and blue for negative peak region. You may choose other colors in preferences setting. The following examples are given to illustrate this color scheme. In the spectrum view on the left, the region is highlighted in yellow, which means the matching peaks are expected to be positive. In the spectrum view on the right, the region is highlighted in blue, which means the matching peaks are expected to be negative.

In some circumstances, the peak region may fold over the spectrum and split into two parts. The upper part spans from the top border of the C-H plane to the first line, and the lower part from the second line to the bottom border of the C-H plane, as shown in the examples below. They contain matching peaks of opposite signs, and hence are highlighted in different colors. In the spectrum view on the left, SCAssign searches for positive peaks in the upper part and negative peaks in the lower part, while in the spectrum view on the right, SCAssign searches for negative peaks in the upper part and positive peaks in the lower part.

The Algorithm:

For those who are interested, the following is an outline of the peak match algorithm.

1. Calculate the peak match region based on the chemical shifts of the selected spin triplet and the tolerance settings.
2. Alias the region onto the spectrum, and if it gets folded, determine the area of each part.
3. Determine the sign (+/-) of matching peaks in the region (or each part of the region if it gets folded).
4. Pick peaks in the region (or one part of it) by calling Sparky’s peak picking function.
5. Search among both existing and newly picked peaks for the ones whose center falls within the region (or one part of it).
6. Filter the result of step 5 for peaks above threshold and of the correct sign, and sort them by data height.
7. Repeat steps 4 to 6 for the other part of the region if it gets folded, and display the final result.
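
The folding and filtering steps of this outline can be illustrated with a one-dimensional sketch along the carbon axis only. It is not SCAssign's code. The function names, the half-open spectral window, and in particular the rule that the expected peak sign flips with every fold are assumptions made for the example; the real sign behaviour depends on how the spectrum was acquired. Peak picking (step 4) is left out because it relies on Sparky itself.

```python
import math


def fold_region(lo, hi, w_lo, w_hi):
    """Alias the search interval [lo, hi] into the spectral window [w_lo, w_hi).

    Returns a list of (part_lo, part_hi, n_folds); a region that crosses a
    window edge is split into two (or more) parts.
    """
    sw = w_hi - w_lo                      # sweep width of this dimension
    n = math.floor((lo - w_lo) / sw)      # number of folds needed for `lo`
    parts = []
    start = lo
    while start < hi:
        image_top = w_lo + (n + 1) * sw   # upper edge of the n-th window image
        end = min(hi, image_top)
        if end > start:
            parts.append((start - n * sw, end - n * sw, n))
        start = end
        n += 1
    return parts


def match_peaks(peaks, parts, threshold):
    """Keep peaks lying in any part, above threshold and of the expected sign.

    peaks: list of (position, data_height). Results are sorted from strongest
    to weakest by the magnitude of the data height.
    """
    hits = []
    for part_lo, part_hi, n in parts:
        sign = 1 if n % 2 == 0 else -1    # ASSUMPTION: sign alternates per fold
        for pos, height in peaks:
            if part_lo <= pos <= part_hi and height * sign >= threshold:
                hits.append((pos, height))
    return sorted(hits, key=lambda p: abs(p[1]), reverse=True)
```

For a window of 10 to 70 ppm, fold_region(68.0, 73.0, 10.0, 70.0) splits the region into an unfolded part reaching the top border and a once-folded part starting at the bottom border, which corresponds to the two differently coloured parts described earlier.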

Refine Peak Match:

If you expect some matching peaks but SCAssign fails to find any, or if SCAssign returns too many matching peaks for most of the spin triplets selected, you may want to consider refining the peak match. One way to do this is to adjust the parameters used for peak picking, as shown below. You can call up this dialog window either by accessing menu “Peak” >> “Picking…”, or typing the accelerator “kt”. Information about the meaning of these parameters and how they will affect the peak picking can be found in the Picking Peaks section of the Sparky manual. Please note that the setting will be used only for picking new peaks and hence has no effect on those that were picked previously.

Alternatively, you can adjust the peak match tolerances, which determine the size of the region where SCAssign will search for potential matching peaks. The bigger the tolerances, the wider the region that SCAssign has to scan through, and the longer it takes to get the result. So please set according to the condition of your spectrum a moderate value, which on one hand allows reasonable deviation to cover the real match, and on the other hand, prevents an overwhelming list of potential matching peaks.

When filtering peaks, SCAssign uses the lowest contour levels as the thresholds. You may change these values in the contour dialog at any time during the side-chain assignment, and unlike minimum linewidth or drop off factor, this setting will affect both new and existing peaks.

Close View of Peaks:

The peaks listed in SCAssign’s peak list are just potential candidates of a real match to the selected spin triplet, and you need to examine each of them before making the assignment. For the majority of CA and CB, only one or two peaks will be identified as possible matches if the peak match criteria are set properly. More matching peaks may be found for other C spins whose exact chemical shifts are unknown from the prior backbone assignment and empirical ranges according to BMRB statistics are used instead. To take a look at each peak, simply click on the respective entry in the peak list. SCAssign will automatically switch to the C-H plane where the peak resides and indicate its position with a crosshair. You can, of course, type Sparky command “zi” to zoom in and get a more detailed view of the shape and contour of any peak that catches your interest.

Reference Plane:

SCAssign has a handy feature to help you quickly confirm an assignment or resolve ambiguities caused by degeneracy of the selected spin triplet. When you click on an entry in the peak list, besides locating the peak on its C-H plane, the program automatically generates another view showing the reference C-H plane and places the crosshair at the same C-H coordinates, as illustrated in the previous demo. The reference plane is chosen according to the composition of the selected spin triplet.

For (H^N_i, N_i, C_i), the plane defined by N_i+1-H_i+1 will be the reference plane.
In the above case, if i is the last residue, the plane defined by N_i-1-H_i-1 will be the reference plane.
For (H^N_i, N_i, C_j) where i ≠ j, the plane defined by N_j-H_j will be the reference plane.
Should the NH fall on a Pro residue, an empty view will be displayed.
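
The four rules above can be written as a short function. This is a sketch under stated assumptions rather than SCAssign's implementation: the name reference_plane, the 1-based residue numbering, and the use of a one-letter sequence string are chosen for the example.

```python
def reference_plane(i, j, sequence):
    """Return the residue number whose N-H defines the reference plane.

    i: residue supplying H^N and N of the selected triplet (1-based)
    j: residue supplying C of the selected triplet (1-based)
    sequence: protein sequence in one-letter code
    Returns None when an empty view would be shown.
    """
    if i == j:
        # Intraresidue triplet: use the next residue, or the previous one
        # when i is the last residue of the sequence.
        ref = i + 1 if i < len(sequence) else i - 1
    else:
        # Triplet spanning two residues: use the plane of residue j.
        ref = j
    if ref < 1 or ref > len(sequence):
        return None   # no neighbouring residue exists (one-residue sequence)
    if sequence[ref - 1] == "P":
        return None   # proline has no amide NH, so the view stays empty
    return ref
```

With the made-up sequence "KTEPA", reference_plane(2, 2, "KTEPA") returns 3, while reference_plane(3, 3, "KTEPA") returns None because residue 4 is a proline.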

The following example shows how reference plane can help resolve ambiguous assignments, based on the principle of reciprocal confirmation from intraresidue and sequential NOEs that was introduced earlier. Here we have two peaks identified for the spin triplet (H^N, N, CA) of D87 in a maltose binding protein (MBP, 370 residues, 42 kDa). SCAssign displays the C-H plane defined by the NH of K88 as the reference plane. To determine which peak (intraresidue NOE) is the real match, we need to check for the presence of its corresponding peak (sequential NOE) on the reference plane. The spectrum view below on the left centers at the first peak, while in the view on the right a corresponding sequential NOE peak can be clearly seen on the reference plane.

In contrast, no corresponding sequential NOE peak can be observed on the reference plane (shown in the spectrum view below on the right) for the second peak (shown below on the left). With such information, it is fairly safe to conclude that the first peak is the real match and to assign its aliphatic proton shift as the chemical shift of HA.

For the assignment of HA and HB, most ambiguities can be readily resolved by comparing with the reference plane. This method can also be used to confirm an assignment. When the 4D NOESY spectrum alone fails to unambiguously assign an aliphatic side-chain atom, often for CG, CD, etc., strip plots from the CCH-TOCSY spectrum can be drawn to help resolve the ambiguities.

Assign Peaks:

Once you have identified which peak is the real match, you can assign it by Shift-clicking (press and hold the “Shift” key while clicking) on its entry in SCAssign’s peak list instead of manually filling in the group and atom names for each axis in the “Assignment” dialog window (accelerator “at”). SCAssign will generate the assignment label according to the selected spin triplet, and display it in the spectrum view next to the peak. At the same time, an asterisk mark will appear at the end of the entry to indicate that the peak has been assigned. You may increase the size of the labels in the “Ornament Sizes” dialog window (shown below, accelerator “oz”) if they are too small to be seen. To unassign a peak, simply Shift-click on its entry again. Both the assignment label and the asterisk mark will then be cleared.

When a peak is assigned, SCAssign will check its frequency in each of the four dimensions and alias it if necessary to bring the value closest to the expected frequency. This step would be ignored if the peak was aliased by the user prior to the assignment. For H^N, N, CA and CB, the expected frequency is taken as the respective chemical shift obtained from backbone assignments; for CG, CD, etc. and all side-chain protons, it is taken as the respective mean chemical shift in the empirical range.
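
The "closest to the expected frequency" rule amounts to shifting the observed frequency by a whole number of sweep widths. The following is a minimal sketch of that arithmetic for one dimension; the function name and return value are assumptions and it does not call any Sparky function.

```python
def alias_closest(observed, expected, sweep_width):
    """Shift `observed` by whole sweep widths to land nearest `expected`.

    Returns (aliased_frequency, n), where n is the number of sweep widths
    added (negative when the peak is moved to lower frequency).
    """
    n = round((expected - observed) / sweep_width)
    return observed + n * sweep_width, n
```

For a peak observed at 25.0 ppm in a dimension with a 40.0 ppm sweep width and an expected shift of 68.0 ppm, the function returns 65.0 ppm with n = 1. A peak already within half a sweep width of the expected value is left where it is.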

Many amino acid residues contain side-chain carbon atoms that carry more than one hydrogen atom (e.g., CA in Gly has HA2 and HA3). They are reflected on the 4D NOESY spectrum as distinct peaks with the same aliphatic carbon shift but slightly different aliphatic proton shifts. Since in this case SCAssign is unable to differentiate one hydrogen atom from another, it will simply assign those peaks with the same label. To append a suffix number to a hydrogen atom, you need to edit the label in the “Assignment” dialog window.

Edit Peaks:

Most of the time the auto-alias performed by SCAssign gives good results and hence you save the hassle of doing it manually. In case of wrong alias, you can correct using Sparky command “a1”, “a2”, … or “A1”, “A2”, … (refer to Aliased peaks in the Sparky manual). Please be reminded that all aliases made to a peak will be cleared when you unassign the peak by Shift-click.

As you may have noticed, all Sparky commands are in fact callable from within SCAssign. Besides zooming in and out of a spectrum view or aliasing a peak, you can type “lt” to call up a peak list for the currently selected spectrum, or “js” to save the project, and so on. In addition, you may still use the function keys (F1 to F12, some keys may not work on certain machines) in SCAssign to change the pointer mode.

SCAssign always highlights the entry of the selected peak in its peak list. This is true even when you select multiple peaks in the spectrum view. If a peak is dragged, SCAssign will immediately update to show the new position and data height, and re-sort its peak list accordingly. Consistent with the way Sparky works, you can delete unwanted peaks in SCAssign simply by pressing the “Delete” key.

Plot Strips:

Before plotting strips, please make sure that you have properly configured the X, Y and Z dimensions of the CCH-TOCSY spectrum during setup. SCAssign constructs a strip plot by first locating its position on the X-Z plane using the above mentioned C-H frequencies, and then displaying the spectrum view along the Y dimension. You may notice that the “Strip plot” extension itself provides a “Show peak strips” dialog window (accelerator “ss”, refer to How to Show Strips in Sparky manual for a full description and screenshot), where you can choose the spectra for which strip plot will be performed and define how their axes correspond to the X, Y and Z dimensions. Do not confuse this with SCAssign’s own spectra setup. In fact all settings made in this window will be totally ignored by SCAssign in strip plot. They are neither sufficient nor necessary.

Selecting and right-clicking any peak in SCAssign’s peak list will show the corresponding strip in CCH-TOCSY. A “Strip plot” window will be created if none exists. Otherwise, it will be restored (if previously minimized) and raised on top of all other windows. The newly drawn strip is highlighted in a color frame. SCAssign labels each strip at the bottom with the respective C-H frequencies, the identity of the residue containing the C-H group, and the position of the C-H group (α, β, γ, etc.). It also aliases the C frequency to get the expected position of that C-H carbon atom on the Y axis of the spectrum, and marks the position in the strip with a bright line for easy comparison.

The auto-alias for calculating the C-H frequencies of a strip is performed only on unaliased peaks and affects only the strip. The peak itself will not be auto-aliased until it is assigned by the user. Once a peak is aliased, SCAssign will instead use the C-H frequencies of that peak for strip plot. In addition, SCAssign synchronizes all strips with the respective peaks, so at any time you can drag a peak to adjust its position, or manually alias it if you are not satisfied with the result of the strip plot based on auto-alias. The strip will be redrawn and updated immediately to reflect such changes. Sometimes the “closest to the expected value” principle fails and SCAssign may mistakenly alias the C-H frequencies for strip plot when they are in fact identical to the on-spectrum frequencies of the peak. If this happens, just turn off the auto-alias feature by deselecting the checkbutton located in the upper right corner of the “Strip plot” window as shown below, and plot the strip again.

Analyze Strips:

The “Strip plot” window is interlinked with SCAssign’s peak list and the spectrum view to allow convenient analysis and cross-check. For example, when you click on a peak entry in the list, if it has a strip but the strip is not currently displayed in the plotting area, SCAssign will automatically scroll to that strip and highlight it. Similarly when you double-click on a strip, SCAssign will show the corresponding peak in the spectrum view and select it, pinpoint with a crosshair the same C-H position on the reference plane, and highlight the peak entry in the list if present. A Flash demo is available below. Please note that the reference plane is not shown due to space constraint.

For any strip in the strip plot, once its corresponding NOE peak is deleted, the strip will be deleted as well. If you just want to delete the strip without affecting the peak, click to select the strip first (which will be highlighted upon selection) and then type the command “sd”. Typing “sD” will delete all strips. You can zoom in or out on a selected strip using the command “si” or “so”. You can also type “sw” to bring up a dialog window for adjusting the strip width and the gap between the strips. All these commands are accessible under the “Show” menu as highlighted below. For more details on the functions provided by the “Strip plot” extension, you may read the Strip Plot section of the Sparky manual.

Based on the general strategy that will be introduced later, the strips can be compared and analyzed to confirm assignments or resolve ambiguities that could not be resolved with the 4D NOESY spectrum alone. As with entries in SCAssign’s peak list, Shift-clicking on a strip will assign the corresponding NOE peak and auto-alias it closest to the expected frequencies.

Identify Weak NOE:

Not only does SCAssign speed up the process of assigning aliphatic side-chain resonances in large proteins, it also produces a more complete set of assignments, since with the help of SCAssign you can now assign some side-chain resonances that could not be manually assigned before. In this section, we will present an example in which weak NOEs between an amide proton and aliphatic protons at the distal end of a side-chain can be identified by SCAssign and used for resonance assignment.

Our general strategy to assign aliphatic side-chain resonances was developed based on the statistics of interatomic distances which indicate that nearly all HAs and HBs, many HGs, and some HDs will give rise to both intraresidue and sequential NH-CH NOEs. However, a number of such NOEs, especially those involving HDs and HEs, may appear too weak due to their usually longer distances to amide protons, and hence may not be observed in the contour plot with threshold set for manual analysis. The spectrum view below on the left shows a C-H plane from 4D NOESY defined by the NH of T2 in a maltose binding protein (MBP, 370 residues, 42 kDa). The contour plot has a threshold of 2.4×10⁶ for both positive and negative peaks, a setting used most of the time during manual assignment to maximally eliminate background noises. With SCAssign, we can quickly assign HA and HB of K1 by searching for peaks whose chemical shifts match the triplet (H^N₂, N₂, CA₁) and (H^N₂, N₂, CB₁). Possible peaks for CG and HG of K1 can be similarly found by SCAssign using the empirical range of CG’s chemical shift, and the ambiguities are resolved with additional information obtained from the strip plot. However, when we are trying to assign CD, HD, CE and HE of K1, the program returns no possible peaks at the current threshold.

In an attempt to assign these resonances, we first lower the threshold of the contour plot to 1.2×10⁶ for the 4D NOESY spectrum. As a result, more peaks emerge and we also start to see some noise (as shown above on the right). We then search for possible peaks again in SCAssign, and this time 10 are found for (H^N₂, N₂, CD₁) and 8 for (H^N₂, N₂, CE₁). Right-click on each entry in the peak list to display the strip plot. A peak whose corresponding strip contains no meaningful pattern is most likely noise and hence can be deleted straight away. The remaining strips are compared with the strips of K1A, K1B and K1G as references to resolve ambiguities on the basis of pattern matching. In the following screenshot (due to space constraint only representative strips of K1D and K1E are shown here), it is not difficult to see that the peak patterns in the 3rd strip of K1D and the 2nd strip of K1E (counting from the left) align best with those in the reference strips. Therefore, the C-H frequencies of these two strips can be assigned as the chemical shifts of CD and HD, and of CE and HE, respectively.

General Strategy:

Please refer to the paper below if you want a comprehensive description.

Xu, Yingqi; Lin, Zhi; Chien, Ho; Yang, Daiwen. A General Strategy for the Assignment of Aliphatic Side-Chain Resonances of Uniformly 13C,15N-Labeled Large Proteins. J. Am. Chem. Soc. 2005, 127(34):11920-11921

Traditionally, structure determination by NMR is suitable for small proteins (<25 kDa) of which backbone and side-chain assignments can be obtained using uniformly ¹³C,¹⁵N-labeled samples with triple resonance experiments. Such an approach often fails when applied to proteins larger than 30 kDa due to increased transverse relaxation rates. With the introduction of deuteration and TROSY techniques, it is possible to assign backbone and ¹³C^β resonances for proteins up to 100 kDa. However, deuteration also significantly reduces the distance constraints derived from NOEs among aliphatic and aromatic protons, leading to low-resolution structures. Although more long-range NOEs can be observed by selectively reintroducing methyl protons into otherwise deuterated samples since methyl groups are often involved in hydrophobic cores, preparation of deuterated and methyl-protonated samples is always costly and time-consuming and may not be suitable for every protein. To further improve structure resolution, it is necessary to constrain side chains of all or most residues using NOEs among protons located at side chains. This implies that complete or partial protonation at most side chains is inevitable.

For fully protonated large proteins, our group has developed a novel experiment MQ-(H)CC_mH_m-TOCSY and a strategy to assign side-chain resonances of methyl-containing residues. The method has been successfully applied to a 42 kDa AcpS trimer and a 65 kDa chain-selectively labeled hemoglobin. So far, no method can be applied to assign side-chain resonances of residues that contain no methyl groups.

The new strategy introduced here makes use of 4D ¹³C,¹⁵N-edited NOESY and prior backbone assignments to assign side-chain resonances of all residues in uniformly ¹³C,¹⁵N-labeled large proteins. Although most triple resonance experiments involving both ¹³C and ¹⁵N spins have very poor sensitivity for protonated large proteins, NOESY experiments are still sensitive enough to provide through-space correlations between spins separated by 4.5 Å or less (5.5 Å for methyl groups). Statistics on interatomic distances indicate that nearly all intraresidue H^N_i-H^α_i, H^N_i-H^β_i and sequential H^N_i-H^α_i-1, H^N_i-H^β_i-1 NOEs can be observed. In 4D NOESY, each amide correlates with a number of CH_n groups at positions [ω(H^N_i), ω(N_i), ω(C^k_j), ω(H^k_j)], where ω is the chemical shift; k is the kth carbon/hydrogen of residue j. H^α and H^β can be assigned from intraresidue or sequential NOEs, provided that these NOEs are distinct from other interresidue NOEs on the basis of prior assignment of H^N, N, C^α, and C^β spins. Otherwise, ambiguities in assignment can be resolved using both intraresidue and sequential NH-CH NOE correlations. If the ambiguity cannot be resolved due to a lack of sequential or intraresidue NOEs, an MQ-(H)CC_mH_m-TOCSY experiment can be applied to confirm the assignment. Assignments of H^γ, H^δ, and H^ε are much more challenging since the chemical shifts of respective carbon spins are unknown. According to the statistics on H-H distances, many H^γs and some H^δs give rise to both intraresidue and sequential NH-CH NOEs and thus can be assigned in 4D NOESY. The remaining spins can be assigned by combining TOCSY and NOESY experiments.

This strategy was tested on a cell adhesion protein (DdCAD-1, 214 residues) and β-chain of human normal adult hemoglobin in the carbonmonoxy form (rHbCO A, ~65 kDa with two identical α-chains and two identical β-chains). By comparing the results with those obtained previously, it was shown that most aliphatic side-chain resonances of these large proteins can be reliably assigned.

The procedure for assigning H^α and H^β is as follows.

Step 1:
Identify peaks whose chemical shifts match the shifts [ω(H^N_i), ω(N_i), ω(C^α_i)] on the C-H plane in 4D NOESY defined by N_i-H_i. If only one peak matches, the aliphatic proton shift of this peak is presumably assigned as the chemical shift of H^α_i.

Step 2:
Similarly by substituting ω(C^α_i-1), ω(C^β_i) and ω(C^β_i-1) for ω(C^α_i) above, H^α_i-1, H^β_i and H^β_i-1 may be presumably assigned, provided that unique matches also exist (for CH₂ groups, two peaks with identical carbon shifts are also regarded as a unique match).

Step 3:
In steps 1 and 2, if an assignment obtained from intraresidue NOE is consistent with that obtained from sequential NOE (Figure a, b on the right), the assignment is confirmed.

Step 4:
For an unconfirmed assignment on the C-H plane defined by N_i-H_i, if its C-H shifts match those of any peak on the plane defined by N_i+1-H_i+1 or N_i-1-H_i-1 (Figure c, d), the assignment is also confirmed.

Step 5:
When no assignment can be made in step 1 or 2 due to ambiguities, directly compare the planes defined by N_i-H_i and N_i+1-H_i+1 (or N_i-1-H_i-1) to resolve the ambiguities.
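
A simplified sketch of the logic in steps 1 to 5 is given below. It is an illustration only, not the published procedure in code form: peaks are plain (H^N, N, C, H) tuples, the function names and tolerance values are hypothetical, the CH₂ case of two peaks with identical carbon shifts is not treated as a unique match, and only the plane of residue i+1 is used for the sequential NOE.

```python
def find_matches(peaks, hn, n, c, tol):
    """Aliphatic 1H shifts of 4D NOESY peaks matching the (H^N, N, C) triplet.

    peaks: list of (hn, n, c, h) chemical shifts in ppm
    tol:   (hn_tol, n_tol, c_tol) matching tolerances in ppm (hypothetical)
    """
    t_hn, t_n, t_c = tol
    return [p[3] for p in peaks
            if abs(p[0] - hn) <= t_hn
            and abs(p[1] - n) <= t_n
            and abs(p[2] - c) <= t_c]


def assign_proton(peaks, nh_i, nh_next, c, tol, h_tol=0.02):
    """Try to assign the proton attached to carbon shift `c` of residue i.

    nh_i, nh_next: (H^N, N) shifts of residue i and residue i+1
    Returns (shift, status), status being 'confirmed', 'presumed' or 'ambiguous'.
    """
    intra = find_matches(peaks, nh_i[0], nh_i[1], c, tol)           # plane N_i-H_i
    seq = find_matches(peaks, nh_next[0], nh_next[1], c, tol)       # plane N_i+1-H_i+1
    # Steps 3 to 5: keep intraresidue candidates that reappear, at the same
    # proton shift, as a sequential NOE on the neighbouring plane.
    agreeing = [h for h in intra if any(abs(h - s) <= h_tol for s in seq)]
    if len(agreeing) == 1:
        return agreeing[0], "confirmed"
    # Steps 1 and 2: a unique match on one plane gives a presumed assignment.
    if len(intra) == 1:
        return intra[0], "presumed"
    if len(seq) == 1:
        return seq[0], "presumed"
    return None, "ambiguous"
```

When two intraresidue candidates exist but only one of them has a counterpart on the neighbouring plane, the function returns that one as confirmed, which is the situation described for D87 of MBP in the Reference Plane section.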

Since degeneracy of (H^N, N, C) spin triplets is much less likely than that of (H^N, N) spin pairs, most H^αs and H^βs could be presumably assigned with only intraresidue or sequential NOEs (result table, columns A and B). For DdCAD-1, we found that all assignments obtained with both intraresidue and sequential NOEs (result table, column C) were correct, while three of the unconfirmed assignments (result table, column D) were incorrect. In rare cases where the above methods fail to unambiguously assign H^α or H^β, a 3D CCH-TOCSY spectrum can be used to resolve the ambiguities, as will be described in the next section.

The procedure for assigning H^α and H^β can be similarly applied to assign other side-chain resonances. Although exact chemical shifts of C^γ and C^δ are unknown, their empirical ranges may serve as a guide for locating possible peaks (Figure a, b on the right). However, due to the obvious problem of chemical shift degeneracy as well as the usually longer distances to amide protons, fewer C^γH^γ_n and C^δH^δ_n groups can be assigned with 4D NOESY alone (result table, columns A and B). In this case, a 3D CCH-TOCSY spectrum has to be used in conjunction with 4D NOESY to assign any unconfirmed and remaining resonances (Figure c-g on the right). Details are given below.

Once a peak at position [ω(H^N_i), ω(N_i), ω(C^k_j), ω(H^k_j)] in 4D NOESY is unambiguously assigned (most often an H^α or H^β peak), draw its corresponding strip plot at [ω(C^k_j), ω(H^k_j)] in CCH-TOCSY and mark the C^k peak on the strip. Such strips will be our “reference strips”. Later on, if ambiguities arise when assigning other resonances (e.g. C^γH^γ_n) in residue j, similar strip plots can be drawn and compared for the number of matching peaks (as those joined by lines in Figure c-e and f, g). The more matches a strip shares with the “reference strips”, the more likely it is that the NOE peak to which it corresponds is the correct one to receive the assignment.
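
The comparison against reference strips can be pictured as counting aligned peaks. The sketch below assumes each strip has already been reduced to a list of ¹³C peak positions along the TOCSY axis; the function names, the 0.2 ppm tolerance, and the scoring by total matches are illustrative choices, not part of the published method.

```python
def count_matches(strip, reference, tol=0.2):
    """Number of peaks in `strip` that line up with some peak in `reference`."""
    return sum(1 for c in strip if any(abs(c - r) <= tol for r in reference))


def rank_candidates(candidates, reference_strips, tol=0.2):
    """Rank candidate strips by total matches with all reference strips.

    candidates:       dict mapping a candidate name to its list of 13C positions
    reference_strips: list of strips (lists of 13C positions) already assigned
    Returns (name, score) pairs, best first.
    """
    scores = {name: sum(count_matches(strip, ref, tol) for ref in reference_strips)
              for name, strip in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The candidate at the top of the ranking is the one whose NOE peak is most likely to receive the assignment; a candidate scoring zero against every reference strip is a likely noise peak.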

The above method can also be used to assign H^α or H^β that could not be assigned previously with 4D NOESY alone, provided that there are other confirmed assignments from the same residue for comparison, or the strip plot itself provides enough information.

After all these steps, the aliphatic side-chain assignment can reach a completeness of up to about 96% in DdCAD-1 and 80% in rHbCO A (the ratio of the assigned to total aliphatic CH_n groups, result table, column E). Hence, using our strategy, many more distance constraints may be obtained for accurate structure determination of large proteins.