A system for describing chemical formulas on the web.

Structural formulas

Rules for describing structural formulas in the easyChem system.

You can use the Substances reference to find structural formulas by name, molecular formula (empirical formula) or categories of chemicals.

Unfortunately, describing structural formulas cannot be entirely simple, but the CharChem system greatly simplifies the task. Its guiding principle is to keep the process of writing the code of a formula as close as possible to the process of drawing that formula with pencil on paper.

Chains, nodes and bonds

Consider the formula of ethyl alcohol:
CH3-CH2-OH
This is a simple chain consisting of three nodes and two bonds. Each node is written the same way as in a linear formula, and each bond is written as a minus sign. The description barely differs from the rendered result. This is the principle of visual conformity.

Brief description of bonds and their directions

Now draw the same formula vertically:
CH3|CH2|OH
The line records the same steps you would take when drawing the formula: three nodes and two | characters, each denoting a line drawn from top to bottom.
What other ways are there to indicate chemical bonds? The symbols / and \ represent slanted bonds:
H/O\H
Double bonds are written as follows:
H2C=CH2 <-> CH2||CH2 <-> H2C\\CH2 <-> H2C//CH2
Triple bonds are similar:
HC%CH <-> CH|||CH <-> HC\\\CH <-> HC///CH
A horizontal triple bond is written with the symbol % or ≡ (U+2261).
But how do you represent bonds drawn from right to left or from bottom to top? Put a ` (grave accent) before the bond symbol:
H`-C`%N + Na`|O`|H = Na`\C`\\\N + H`/O`\H
To avoid confusion among \, `\, / and `/, mentally check whether the bond is drawn from left to right. If so, do not add a grave accent. The grave accent means the opposite direction, from right to left.

Auto-nodes and chains closure

Just as skeletal structural formulas omit hydrogen atoms and show carbon atoms as the corners between bonds, so can a CharChem description. For example, to draw a benzene molecule as a hexagon, write in a row the characters corresponding to its lines:
\||`/`\\`|//
Nodes are present here too, although they are not visible. They are called automatic nodes. The calculation of molecular weight and empirical formula still takes all the invisible atoms into account.
The second feature of this formula is that the end of the chain arrives back at its beginning. The last bond, written //, connects the last node to the first one.
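The closure of the benzene chain can be checked with a small sketch that replays the "pencil" movements. This is not part of CharChem; the function name walk_chain is hypothetical, and the sketch assumes that slanted bonds lie 30° off the horizontal and that every bond has length 1.

```python
import math

# Assumption: slanted bonds lie 30 degrees off the horizontal; the Y axis points down.
SLOPE = math.radians(30)

# Direction of each bond symbol when drawn in its normal sense
# (left to right for - / \ and top to bottom for |).
BASE_DIRECTIONS = {
    "-": (1.0, 0.0),
    "|": (0.0, 1.0),
    "/": (math.cos(SLOPE), -math.sin(SLOPE)),   # up and to the right
    "\\": (math.cos(SLOPE), math.sin(SLOPE)),   # down and to the right
}


def walk_chain(code):
    """Return the node positions visited by a chain of brief bond descriptions.

    Handles only the symbols - | / \\ (repeated for multiplicity) and the
    grave-accent prefix; atoms, branches and functions are not supported.
    """
    x, y = 0.0, 0.0
    points = [(x, y)]
    i = 0
    while i < len(code):
        reverse = code[i] == "`"
        if reverse:
            i += 1
        symbol = code[i]
        dx, dy = BASE_DIRECTIONS[symbol]
        # A repeated symbol (|| or ///) is one bond of higher multiplicity.
        while i < len(code) and code[i] == symbol:
            i += 1
        if reverse:
            dx, dy = -dx, -dy
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


benzene = walk_chain(r"\||`/`\\`|//")
print(len(benzene) - 1, "bonds, end point:", benzene[-1])
```

Six bonds are walked, and the end point coincides with the start (up to rounding), which is the chain closure described above.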

Description of several chains and references to bonds

Usually it is impossible to describe a structural formula with a single chain. Suppose you want to show methane as four hydrogen atoms bonded to a carbon atom. You can use two chains, one horizontal and one vertical, that intersect at the carbon atom:
H-C-H; H|#C|H
A semicolon marks the end of one chain and the beginning of the next. It may be followed by spaces or a line break to improve readability.
Another important feature is the reference to the carbon atom with the instruction #C. It means that no new node is created; the existing one is used instead.

The numbering of nodes

To draw a detailed formula of ethyl alcohol, we need three chains and two intersections. In this case we refer to nodes by number. Nodes are numbered from 1 in the order they appear in the description. Automatic nodes are no exception and are numbered the same way as explicit ones.
Thus, ethanol:
H-C-C-OH; H|#2|H; H|#3|H
Here the first node is a hydrogen atom, and the references point to the second and third nodes, which are the carbon atoms. The fourth node is OH, because no bond is written between its atoms. The fifth node is the first hydrogen atom of the second chain, and the sixth node is the last hydrogen atom of the second chain. The reference #2 points to a node that already has the number 2, so it is not numbered a second time. Here are a couple of other ways to describe the same molecule:
H|C|H; H|C|H; H-#C-#5-OH <-> H-C|H; H|#C-C|H; H|#-3-OH
The last variant uses a negative number, #-3. This is useful when it is easier to count nodes from the end rather than from the beginning. Number -1 is the hydrogen atom declared just before #-3, and number -2 is the last hydrogen atom of the second chain.
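The numbering rule can be illustrated with a minimal sketch. The helper resolve_node_number is hypothetical and only mirrors the rule stated above: positive numbers count from the first node, negative numbers count back from the most recently declared one.

```python
def resolve_node_number(ref, node_count):
    """Turn a reference number into a 1-based node index.

    Positive numbers count from the first node; negative numbers count back
    from the most recently declared node (-1 is the last one).
    """
    index = ref if ref > 0 else node_count + ref + 1
    if not 1 <= index <= node_count:
        raise ValueError(f"reference {ref} is out of range for {node_count} nodes")
    return index


# Nodes declared in "H-C|H; H|#C-C|H; H|" before the reference #-3 is met.
nodes = ["H", "C", "H", "H", "C", "H", "H"]
target = resolve_node_number(-3, len(nodes))
print(target, nodes[target - 1])  # prints: 5 C
```

The reference #-3 therefore lands on the fifth node, the second carbon atom, which is where the OH group must be attached.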

Node labels

There is another way to refer to a node: assign it a label.
H|C:cntr|H; H-#cntr-H
A label is assigned with a colon after the node description and consists of letters and digits.


Branches

There is a second way to describe non-linear structures: branching. For example, acetic acid:
H3C-C<//O>\OH
The symbol < is placed immediately after the node description. After the symbol > the pencil returns, as it were, to the node that preceded the opening of the branch. Branches can be nested within branches. For example:
H\N</N<`|H>\H>|H
Unfortunately, the characters < and > have a predefined meaning in HTML. If we write </I> in an HTML file, the browser treats it as the closing tag for <I>, even though no such tag was opened, and the branch with the iodine atom disappears from the formula. To prevent such conflicts, you can replace < and > with (* and *). The formula will then be recognized correctly. The two forms are fully equivalent, so they can even be mixed. For example (*/I>
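If formulas are inserted into HTML by a script, the replacement of branch brackets can be automated. The following is a minimal sketch, not a CharChem function; the name html_safe_branches is hypothetical, and the sketch assumes the input is a single structural formula that contains < and > only as branch brackets (so no <-> separators).

```python
def html_safe_branches(formula):
    """Replace the branch brackets < and > with their (* and *) equivalents."""
    return formula.replace("<", "(*").replace(">", "*)")


acetic_acid = html_safe_branches(r"H3C-C<//O>\OH")
print(acetic_acid)  # prints: H3C-C(*//O*)\OH
```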

Regular polygons

You may have noticed that the brief bond descriptions (written with the characters - \ | /) are intended for regular hexagons such as benzene. If we need to draw furan (a pentagon) or oxepin (a heptagon), these constructions will not help. The simplest example is ethylene oxide, a triangle:
`=_p3O_p3
Here a horizontal double bond is drawn first, and then a regular triangle is built on it. The two remaining bonds are written with the construction _p3, which means a single bond of a polygon with three vertices, rotated clockwise relative to the previous bond. A more serious example is the heptagonal oxepin with three double bonds (which are written with pp):
`=_p7_pp7_p7O_p7_pp7_p7 <-> =_q7_qq7_q7O_q7_qq7_q7
There are two variants here: the first turns clockwise, the second counterclockwise. For the latter, use the same construction with the letter q. For pentagons the number may be omitted, so furan is written more simply:
-_pp_pO_p_pp
Unfortunately, this notation is not as simple as the brief description. But complexity grows as new capabilities are added.
Another example: the formula of adenine, which combines a hexagon with a pentagon. Here are two variants:
||_pHN_p_ppN_p/<`|NH2>\\N|`//N`\ <-> /<`|NH2>\\N|`//N`\`||_qN_qq_qHN_q
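The geometry behind the polygon bonds can be sketched as follows. This is an illustration only, under the assumption that each polygon bond has unit length and turns by the exterior angle 360/N relative to the previous bond; polygon_chain is a hypothetical helper, not a CharChem function.

```python
import math


def polygon_chain(sides, start_angle=0.0, clockwise=True):
    """Return the node positions of `sides` unit bonds forming a regular polygon.

    Angles are in degrees, measured clockwise from the left-to-right axis
    (the Y axis points down), so a positive turn is a clockwise turn.
    """
    turn = 360.0 / sides if clockwise else -360.0 / sides
    x, y, angle = 0.0, 0.0, start_angle
    points = [(x, y)]
    for _ in range(sides):
        x += math.cos(math.radians(angle))
        y += math.sin(math.radians(angle))
        points.append((x, y))
        angle += turn
    return points


# Ethylene oxide: the first bond runs right to left (180 degrees),
# then two more bonds each turn clockwise by 360/3 = 120 degrees.
triangle = polygon_chain(3, start_angle=180.0)
print(triangle[-1])
```

After N bonds the chain returns to its starting point (up to rounding), for the clockwise p variant as well as for the counterclockwise q variant.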

Repeated chemical bonds

Let's look at another way of building the formula of ethylene:
H\C|C`/H; H`/#C|\H
The two chains share both carbon atoms. In the second chain, the first carbon atom is declared through a reference. This is necessary because otherwise the chains would not be connected. The next atom, however, is merged by geometric coincidence. A single bond drawn twice in the same place turns into a double bond. The same holds for cycles. For example, ortho-cresol can be drawn as a single unbranched chain.

Dummy bonds

Dummy bonds can also be described. To do so, add a 0 (zero) after the bond description. It means, in effect, that the multiplicity of the bond is zero: the imaginary pencil is lifted off the paper, but the movement still takes place.
Let us draw the formula of pyrene with one chain and no branches. Step 1, single bonds only:
|`/`\`|`\`/|\/`|/`|/\|`/`\`|`\`/|
Where the pencil passes over the same place twice, a double bond appears. In one such place that is exactly what is needed. In the other, remove the extra bond by adding 0. Then add double bonds where they are needed. We get the final version:
|`//`\`|0`\\`/||\//`|/`|/\\|`//`\`|`\`//|
This method can also represent formulas of substances with ionic bonds.
For example, potassium carbonate:
K^+\0O`^-/`|O|\O^-/0K^+

Universal bond

We have already met the brief bond description and polygonal bonds. But there are times when their capabilities are insufficient. For complete control over the structural formula, you can use the universal bond description. In general form it is written as _(parameters). Parameters are written inside the parentheses, separated by commas. Each parameter begins with a character that determines its function.

Cartesian coordinates

The easiest way to specify the position of a node relative to another node is to use x and y coordinates.
If only x or only y is specified, the other coordinate is 0.
H3C_(x2)OH
This is methanol with a bond twice as long as usual. The parameter x is used with a value of 2.
Consider a more complicated example. We move in two dimensions to draw the molecule of D-deoxyribose:
_(x-1,y1)_(x-1)<|OH>_(x-1,y-1)<`|HOCH2>_(x1.5,y-0.5)O_(x1.5,y0.5)`|OH
The coordinate system is the one standard for computer graphics: the X axis runs from left to right and the Y axis from top to bottom. The same applies to all other directions in CharChem.
However, the traditional thickened bonds commonly used in formulas of sugars are missing here. That is easy to fix by adding parameters that make lines widen or narrow: W+ and W-.
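How the relative coordinates add up can be shown with a short sketch. It is not the CharChem parser; parse_xy and end_point are hypothetical helpers that understand only the x and y parameters and ignore everything else.

```python
def parse_xy(params):
    """Parse the inside of _(...) and return (dx, dy); a missing coordinate is 0.

    Only the x and y parameters are understood; any others are ignored.
    """
    dx, dy = 0.0, 0.0
    for item in params.split(","):
        item = item.strip()
        if item.startswith("x"):
            dx = float(item[1:])
        elif item.startswith("y"):
            dy = float(item[1:])
    return dx, dy


def end_point(steps):
    """Add up a sequence of _(...) parameter strings, starting from (0, 0)."""
    x, y = 0.0, 0.0
    for params in steps:
        dx, dy = parse_xy(params)
        x, y = x + dx, y + dy
    return x, y


# The five ring bonds of the D-deoxyribose example, with branches left out.
ring = ["x-1,y1", "x-1", "x-1,y-1", "x1.5,y-0.5", "x1.5,y0.5"]
print(end_point(ring))  # prints: (0.0, 0.0)
```

The five displacements sum to zero, so the ring of the D-deoxyribose example closes on its first node.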

Angular coordinates

Another way to indicate the relative position of nodes is to use angle and line length. Parameters used:
  • A - absolute angle in degrees, measured clockwise from the left-to-right axis. A90 - down, A-90 - up, A180 - backward. If not specified, the value is 0.
  • L - length of the line. L2 - a line 2 times longer than usual. If not specified, the standard length is used (usually 1, but it can be overridden with the $L function).
Example:
H_(A45)C<_(A135)H>_(L1.2,N2)C<_(A-45)H>_(A45)H
Here we see another new parameter, N. It sets the multiplicity of the bond. A triple bond requires N3, and a dummy bond N0.
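The relation between the A and L parameters and the x and y offsets can be written out as a small sketch. The function bond_vector is hypothetical; it only applies the conventions stated above (degrees, clockwise, Y axis pointing down, default length 1).

```python
import math


def bond_vector(angle=0.0, length=1.0):
    """Convert the A (degrees, clockwise, Y axis down) and L parameters to (dx, dy)."""
    radians = math.radians(angle)
    return length * math.cos(radians), length * math.sin(radians)


down = bond_vector(angle=90)      # A90 points down: dy is positive
up = bond_vector(angle=-90)       # A-90 points up: dy is negative
back = bond_vector(angle=180)     # A180 points from right to left
longer = bond_vector(length=2)    # L2 with the default angle of 0
```

Because the Y axis points down, a positive angle turns the bond clockwise on the screen, which matches A90 meaning "down".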


Chirality

To display chirality, use universal bonds with these parameters:
  • w+ for a bond coming toward the viewer (w for wide), or w- if the thickening should run against the bond direction
  • d+ for a bond going away from the viewer (d for deep, dash), or d- for the opposite direction.
Consider bromochlorofluoromethane as an example:
Cl|C<_(A160,d+)H><_(A100,w+)F>_(A20)Br = Cl|C<_(A160,d+)Br><_(A80,w+)H>_(A20)F != Cl|C<_(A160,d+)Br><_(A80,w+)F>_(A20)H
But there is a simpler way: use the brief notation with the suffix w or d. For example, β-D-arabinopyranose:
\/O`|</dOH>`\<`|dOH>`/<`\wHO>|`/wHO

Delocalized π-bonds (a circle inside the polygon)

To display delocalized π-bonds in a ring, use the construction _o. It is placed immediately after the closure of the cyclic chain and ignores branches.
Consider phenol, where the OH group is deliberately moved into a branch for demonstration purposes:
\</OH>|`/`\`|/_o
Naphthalene:
/\|`/`\`|_o`\`/|\/_o

Abstract elements

Sometimes you need to represent only part of a formula rather than the whole. This is useful for polymers, or when you want to express the common part of a group of substances with similar properties. For these cases there are abstract elements, which are text in braces.
Carboxylic acid:
{R}-C<\OH>//O
Polyethylene:
{...}-CH2-CH2-CH2-CH2-{...}
If you need to represent a radical, do not write a bond without a node at the end, because the node will then be generated automatically. Instead, use an empty abstract element.
Sulfoxide: H-O-S-{}; O||#S||O

Features that improve the appearance

There are several functions that modify the appearance of formulas. They all share the same syntax: $name(parameters). If there are no parameters, the parentheses are still required: $name(). Let us consider some of them.

The angle of bonds

The brief bond descriptions written with the \ and / symbols have a special feature: their angle is selected automatically, either 30° or 60°, depending on the neighboring bonds. But there are times when you want different behavior. The function $slope(angle) sets the angle explicitly for all subsequent bonds. Call $slope() to return to the original behavior.

The coefficient of the bond length

The function $L(number) changes the length of all subsequent bonds written with the brief description, and of universal bonds that use the parameter A but not the parameter L.


Colors

There are a number of functions for highlighting different parts of a formula. The simplest is $color(color). The color is given by name (red, green, blue ...) or by a numeric description (#F00, #0F0, #00F ...), as is customary in CSS and HTML. Calling $color() restores the original color.

Hydrogen bonds

If you put the symbol h after a brief bond description, you get a hydrogen bond. For example, the formic acid dimer:
O`//HC\O-H-hO//CH`\O`-H`-h#1
The hydrogen bond may look better if it is made longer. For that, use the universal bond description with the H parameter: _(H). Let us draw the hydrogen bonds between the adenine and thymine molecules in DNA:
H\<-H>`/\\N`/`=N`\/_qN_qq_qN<`/{R}>_q_q-; #3_(x2,H)O\\-</>\\`/N<\{R}>`-<`//O>`\N</>`-H_(H)#5
Here the first chain is adenine. The upper hydrogen bond, _(x2,H), has double length. The second one is _(H); its length is not specified because the positions of both nodes are already defined.

Coordinate bond with arrow

To display a bond with an arrow, use the brief description with the letter v.
H3N-vPt`|Cl; NH3`-v#Pt|Cl
In this example we had to use two chains, because the brief description cannot change the direction of the arrow: the arrow can only point toward the end of the bond, not the beginning. The universal description has no such restriction. The parameter C marks the bond as coordinate: _(C). If the arrow should point back toward the start of the line, use _(C-). The notation _(C+) puts arrows at both ends of the line.
Cl$L(1.4)|Ni|Cl; H3N\v#Ni_(A30,C-)NH3; H3N/v#Ni_(A-30,C-)NH3

Lewis structures (Dots around elements)

There are two ways to describe the dots:
  • List the angles in degrees. For example: $dots(0,90,-90,180)C
  • Or use letters. Pairs of dots are described with the following capital letters:
    L - Left, R - Right, U or T - Up or Top, D or B - Down or Bottom.
    $dots(L){L} + $dots(R){R} + $dots(U){U} + $dots(T){T} + $dots(D){D} + $dots(B){B}
    If you need a single dot rather than a pair, follow the capital letter with a qualifying lowercase letter from the same list:
    $dots(Lu){Lu} + $dots(Lt){Lt} + $dots(Rt){Ru} + $dots(Rb){Rb} + $dots(Dl){Dl} + $dots(Br){Br} + $dots(Ul){Ul} + $dots(Tr){Tr}
    Literal descriptions can be combined without using commas. Here is a more complex example:
    The exclamation mark inverts the description. For example, !L means everything except the left pair, and !Lb means everything except the bottom-left dot. A bare ! means all 8 dots.
    $dots(!L){L} $dots(!Lb){Lb} $dots(!){A}
In addition, the parameters can specify the color of the dots.
As you can see, the color is set by a parameter that starts with c:. The color affects all the dots declared after it, but only inside the parentheses. An empty color cancels all colors declared earlier.
In general, the color of the dots is determined by the following, in descending order of priority: function parameters, $itemColor(), $color(), the base color of the formula.
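The letter notation for dots can be modelled with a short sketch. This is an illustration, not the CharChem implementation: expand_dots and the two-letter position names are hypothetical, colors are not handled, and the sketch assumes that a leading ! inverts the whole description.

```python
# The eight dot positions: a side (L, R, U, D) plus which end of that side.
SIDE_ALIASES = {"L": "L", "R": "R", "U": "U", "T": "U", "D": "D", "B": "D"}
END_ALIASES = {"u": "u", "t": "u", "d": "d", "b": "d", "l": "l", "r": "r"}
ENDS_OF_SIDE = {"L": "ud", "R": "ud", "U": "lr", "D": "lr"}
ALL_DOTS = {side + end for side, ends in ENDS_OF_SIDE.items() for end in ends}


def expand_dots(spec):
    """Expand a letter description such as "L", "LRu" or "!Lb" into a set of dots."""
    inverted = spec.startswith("!")
    if inverted:
        spec = spec[1:]
    dots = set()
    i = 0
    while i < len(spec):
        side = SIDE_ALIASES[spec[i]]
        i += 1
        if i < len(spec) and spec[i].islower():
            end = END_ALIASES[spec[i]]
            if end not in ENDS_OF_SIDE[side]:
                raise ValueError(f"qualifier {spec[i]!r} does not fit side {side!r}")
            dots.add(side + end)  # a single dot
            i += 1
        else:
            dots.update(side + end for end in ENDS_OF_SIDE[side])  # a pair
    return ALL_DOTS - dots if inverted else dots


print(sorted(expand_dots("L")))   # the left pair
print(len(expand_dots("!Lb")))    # everything except the bottom-left dot
print(len(expand_dots("!")))      # all dots
```

Under these assumptions, L expands to two dots, !L to six, !Lb to seven, and a bare ! to all eight.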

Using dummy bonds for comments

Sometimes it is desirable to place comments in the description of a formula. You can use dummy bonds whose nodes are comments.
If a dummy bond is written as /0 or _(A45,N0), no line is drawn. But there is a way to specify the appearance of the line explicitly: the parameter S. S| means a solid line, and S: means a dotted one.
Now a quick look at comments. They are enclosed in double quotes. Inside the comment text you can refer to the CharChem dictionary by putting a key in backquotes. The key may be the symbol of a chemical element, and the result is the name of that element in the current language of the page. As an example, let us draw hydrogen bromide with hydrogen labeled in blue and bromine in lilac, using a dotted line to connect each comment to its atom:
H-Br; $color(blue)#H_(x-1,y-1,S:)"`H`"; $color(magenta)#Br_(x1,y-1,S:)"`Br`"

End of article. Author: PeterWin. Date of last update: 2022-09-22