2
$\begingroup$

I'm a computer scientist and have no biochemical background, but was working on a project recently that had me going through a lot of protein residues for one reason or other. Feel free to correct me from here on, but my understanding is that proteins, also called "peptides" also "subchains" when a part of a larger macromolecule are made of amino-acids(central dogma) which are also frequently referred to as "residues".

Upon closer (programmatic) inspection, each residue (amino-acid) turns out to be a bunch of atoms like so: enter image description here

I was wondering what is the significance of that CA there in each of the residues, and whether i identified it correctly as "alpha-carbon". Protein Data Bank docs refer frequently to a residue's alpha-carbon. If it is indeed the alpha-carbon, i have two questions which are probably dead-simple to anyone with biochem background:

  1. Why is that one(in space) considered alpha and not the carbon atom next to it?

  2. Is it a given that each residue out there has only one alpha-carbon? Seems pretty obvious, but please.

Thanks a ton!

$\endgroup$
4
  • $\begingroup$ The $\alpha$-carbon is the carbon attached to the carboxyl (-COOH) functional group. $\endgroup$
    – MaxW
    Commented May 27, 2020 at 17:02
  • $\begingroup$ @MaxW In this case these are connected with amide groups (besides the C-end, where it's indeed COOH). $\endgroup$
    – Mithoron
    Commented May 27, 2020 at 17:05
  • $\begingroup$ Huh, your output there seems to be indeed confusing. What's its source? $\endgroup$
    – Mithoron
    Commented May 27, 2020 at 17:13
  • 3
    $\begingroup$ The protein backbone runs as $\mathrm{\cdots C-N-C_\alpha -C-N-C_\alpha \cdots }$ but is more often written as $\mathrm{-C(=0)N(H)-C_\alpha(R_1)-C(=0)N(H)-C_\alpha(R_2)-}$ where the $R$ are the amino acid residues always attached to the carbon alpha, and the N atom always has a hydrogen and the C always an oxygen. The $C_\alpha$ are usually labelled $C_{\alpha 1}, C_{\alpha 2}$ etc. It would be easier to look up some pictures. $\endgroup$
    – porphyrin
    Commented May 27, 2020 at 17:45

2 Answers 2

3
$\begingroup$

Question 1: Why is that one(in space) considered alpha and not the carbon atom next to it?

All human proteins consist of $\alpha$-amino acid residues. An $\alpha$-amino acid means the carboxylic acid group ($\ce{COOH}$) and amino group ($\ce{NH2}$) are separated by one $\ce{C}$ carbom atom, which is called $\alpha$-carbon ($\ce{C}_\alpha$; See the insert at bottom right of the diagram):

Dipeptide

Usually, backbone of a protein ($\alpha$-helix) is written as: $$\ce{H2N-C_\alpha(R^1)-C(=O)-NH-C_\alpha(R^2)-C(=O)-NH-C_\alpha(R^3)-C(=O) -}\cdot \cdot \cdot \ce{-NH-C_\alpha(R^n) -COOH}$$

For example, the dipeptide in this diagram can be written as:

$$\ce{H2N-C_\alpha(R^1)-C(=O)-NH-C_\alpha(R^2) -COOH}$$

As demonstrated in these backbones, you'd see each $\ce{C_\alpha(R^1)}$ is in place between $\ce{NH}$ and $\ce{C(=O)}$ (the $\ce{R^1, R^2,}$ etc. are corresponding side chains of the particular amino acids). Thus, $\text{<Atom C>}$ next to $\text{<Atom CA>}$ ($\ce{C}_\alpha$) in the written chart of the program is referring to the carbonyl $\ce{C}$ of that particular amino acid. For example, let's consider three amino acids in the given chart: (10 Arginine; (2) Lysine; and (3) Leucine:

Amino Acids

Arginine (arg) residue in the chart is can be written as follows (the side chain in parenthesis):

$$\ce{-HN-C_\alpha(CH2CH2CH2NHC(=NH)NH2)-C(=O) -}$$

According to the nomenclature off the program, you can rewrite it as:

$$\ce{-HN-C^A(C^BH2C^GH2C^DH2N^EHC^Z(=N^{H1}H)N^{H2}H2)-C(=O) -}$$

where $\ce{A #} \alpha$; $\ce{B #} \beta$; $\ce{G #} \gamma$; $\ce{D #} \delta$; $\ce{E #} \epsilon$;$\ce{Z #} \zeta$; and $\ce{H #} \eta$ (following Greek alphabet letters). Since two nitrogen atoms are attached to $\ce{C}_\zeta$, they are appropriately labelled as $\ce{N^{H1}}$ and $\ce{N^{H2}}$ after next Greek letter $\eta$.

Since the program has avoided hydrogen, let's rewrite it again without hydrogen atoms:

$$\ce{-N-C^A(C^BC^GC^DN^EC^Z(=N^{H1})N^{H2})-C(=O) -}$$

Thus, program writes it as $\ce{N -C_\alpha -C(=O) -}$ first and then the the atoms in side chain next. Hence, Residue ARG: $\text{<Atom N>}$, $\text{<Atom CA>}$, $\text{<Atom C>}$, $\text{<Atom O>}$, and then the side chain in parenthesis as: $\text{<Atom CB>}$, $\text{<Atom CG>}$, $\text{<Atom CD>}$, $\text{<Atom NE>}$, $\text{<Atom CZ>}$, $\text{<Atom NH1>}$, $\text{<Atom NH2>}$.

Similarly, lysine (lys) residue in the chart is (side chain in parenthesis):

$$\ce{-HN-C_\alpha(CH2CH2CH2CH2NH2)-C(=O) -}$$

You can rewrite it according to the nomenclature off the program (avoiding $\ce{H}$s):

$$\ce{-HN-C^A(C^BC^GC^DC^EN^Z)-C(=O) -}$$

Thus, program writes it as $\ce{N -C_\alpha -C(=O) -}$ first again, followed by the atoms in side chain. Hence, Residue LYS: $\text{<Atom N>}$, $\text{<Atom CA>}$, $\text{<Atom C>}$, $\text{<Atom O>}$, and then the side chain in parenthesis as: $\text{<Atom CB>}$, $\text{<Atom CG>}$, $\text{<Atom CD>}$, $\text{<Atom CE>}$, $\text{<Atom NZ>}$.

For leucine, $\ce{-HN-C_\alpha(CH2CH(CH3)CH3)-C(=O) -}$,you can again rewrite the formula according to the nomenclature off the program (avoiding $\ce{H}$s):

$$\ce{-N-C^A_\alpha(C^BC^G(C^{D1})C^{D2})-C(=O) -}$$

Note that since two carbon atoms are attached to $\ce{C}_\gamma$, they are appropriately labelled as $\ce{C^{D1}}$ and $\ce{C^{D2}}$ after next Greek letter $\delta$.

Thus, program writes it as $\ce{N -C_\alpha -C(=O) -}$ first again, followed by the atoms in side chain. Hence, Residue LEU: $\text{<Atom N>}$, $\text{<Atom CA>}$, $\text{<Atom C>}$, $\text{<Atom O>}$, and then the side chain in parenthesis as: $\text{<Atom CB>}$, $\text{<Atom CG>}$, $\text{<Atom CD1>}$, $\text{<Atom CD2>}$.

Question 2: Is it a given that each residue out there has only one alpha-carbon?

As explain in above backbone of the protein, you'd find only one $\ce{C_\alpha}$ fir each amino acid (which is chiral).

$\endgroup$
1
  • 1
    $\begingroup$ This is even more exhaustive and to-the-point than what i was looking to pick up. Thank you! $\endgroup$
    – rtviii
    Commented May 29, 2020 at 1:10
3
$\begingroup$

This nomenclature is due to the fact that amino acids are carboxylic acids.

Near the carboxylic acid moiety, the carbon chain is unbranched and simple, so the positions are named like an unbranched, simple aliphatic carboxylic acid.

The carboxylic acid ($\ce{-CO2H}$) is not indicated with a position. But the carbon immediately next to it is $\alpha$. The one next to that is $\beta$. That should be enough to explain the $\alpha$ in your context because that carbon is the one immediately next to the carboxylic acid. In a protein or polypeptide, the carboxylic acid is most frequently converted to an amide in a peptide bond, but the Greek letter nomenclature is unchanged.

This nomenclature is used in other cases for example, with an $\alpha$,$\beta$-unsaturated carbonyl.

The most publicly visible use of this nomenclature is when we use $\omega$ to denote the end of the carboxylic acid chain. This is the origin of the term $\omega$-3 [unsaturated] fatty acid.

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.