InChI: Advancing Discovery in Chemistry

We’re excited to have the opportunity to spotlight the International Chemical Identifier (InChI), a project of IUPAC and the InChI Trust. This descriptor aims to streamline naming conventions for chemical compounds and reactions. Here, Josef Eiblmaier, Valentina Eigner-Pitto, and Hans Kraut from InfoChem, together with Samuel Winthrop, an expert in the field, explain how InChI is helping researchers standardize results and how Springer Nature chemical content will become more readily available to the public.

Written by Josef Eiblmaier, Valentina Eigner-Pitto, Hans Kraut, and Samuel Winthrop

What’s in a name? Among chemical substances – quite a lot. As researchers continue to elucidate the structures of compounds, synthesise new molecules, and work out the chemical building blocks of the world we live in, an ever-growing list of names and naming conventions complicates efforts to establish the standards that science relies upon.

DOI for chemicals

As computational methods are becoming a vital part of research, a standard yet practical way to describe chemical substances becomes even more essential – a machine-readable standard needs specific rules to be useful and efficient!

Introducing InChI

An InChI is a string of characters derived solely from a structural representation of a chemical substance. It uniquely represents that substance and serves as its digital ‘signature’. Just as a DOI relates to one specific article or chapter, and only that one, an individual InChI identifies a chemical substance without ambiguity, providing a precise, robust, structure-derived tag.

Stephen Heller, one of the creators of InChI, says: “Over time, chemists have created a Tower of Babel of chemical names. Due to creation of vast amounts of information in electronic form, it is a nightmare to locate information and data. Having different names for the same chemical makes it very difficult to find all the necessary information chemists need for their work. Hence NIST* and IUPAC** developed the InChI algorithm and made it freely available as Open Source so that chemists could use it free of charge. We believe strongly that this is the only way InChI can be used by all and can become a standard.”

*,** NIST (National Institute of Standards and Technology) is a US agency that defines standards and IUPAC is the International Union of Pure and Applied Chemistry.

An example InChI (in this case, for ethanol):

InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3

However, these strings can get extremely long. To simplify indexing chemical structures in databases, and to make them easily searchable on the Internet, the InChIKey is used: a hashed version of the string that reduces it to a more manageable, fixed length of 27 characters.
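Because an InChIKey has a fixed length, a quick format check can catch truncated or mistyped keys before they reach a database. The sketch below is illustrative only: the function names are hypothetical, and the block layout it assumes (14 letters, then 8 letters plus a standard/non-standard flag and a version letter, then one protonation letter) should be confirmed against the InChI documentation. It checks the shape of a key only; it does not compute or verify the hash.

```python
import re

# Assumed layout: 14 letters - 8 letters + flag (S or N) + version letter - 1 letter
INCHIKEY_PATTERN = re.compile(r"([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])")


def looks_like_inchikey(text: str) -> bool:
    """Return True if text has the 27-character InChIKey shape."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


def split_inchikey(key: str) -> dict:
    """Split a well-formed key into its blocks (names are illustrative)."""
    match = INCHIKEY_PATTERN.fullmatch(key)
    if match is None:
        raise ValueError(f"not an InChIKey-shaped string: {key!r}")
    first, second, flag, version, protonation = match.groups()
    return {
        "connectivity_block": first,
        "other_layers_block": second,
        "standard": flag == "S",
        "version_letter": version,
        "protonation_letter": protonation,
    }


if __name__ == "__main__":
    aspirin_key = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
    print(len(aspirin_key), looks_like_inchikey(aspirin_key))
    print(split_inchikey(aspirin_key))
```

A check like this is cheap to run on every record at import time, while the authoritative key should always come from the official InChI software.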

Some examples of InChIs for commonly used chemical compounds:

Chemical Compound | InChI | InChIKey
Methanol | 1S/CH4O/c1-2/h2H,1H3 | OKKJLVBELUTLKV-UHFFFAOYSA-N
Ethanol | 1S/C2H6O/c1-2-3/h3H,2H2,1H3 | LFQSCWFLJHTTHZ-UHFFFAOYSA-N
Acetone | 1S/C3H6O/c1-3(2)4/h1-2H3 | CSCPPACGZOOCGX-UHFFFAOYSA-N
Aspirin | 1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12) | BSYNRYMUTXBXSQ-UHFFFAOYSA-N
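The InChI strings listed above are built from slash-separated layers: a version, a chemical formula, and then layers that each start with a one-letter prefix, such as c for connectivity and h for hydrogen atoms. As a rough illustration, the following sketch splits a string into those parts. The function name is hypothetical, and the code only separates the layers; it does not interpret or validate the chemistry, for which the official InChI software should be used.

```python
from typing import List, Tuple


def split_inchi(inchi: str) -> Tuple[str, str, List[Tuple[str, str]]]:
    """Split an InChI into (version, formula, [(layer_prefix, layer_body), ...])."""
    prefix = "InChI="
    if inchi.startswith(prefix):
        inchi = inchi[len(prefix):]  # the prefix is optional in this sketch
    parts = inchi.split("/")
    version = parts[0]
    formula = parts[1] if len(parts) > 1 else ""
    # Each remaining layer begins with a single-letter prefix (c, h, ...).
    layers = [(part[0], part[1:]) for part in parts[2:] if part]
    return version, formula, layers


if __name__ == "__main__":
    aspirin = "1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)"
    version, formula, layers = split_inchi(aspirin)
    print(version, formula)
    for layer_prefix, layer_body in layers:
        print(layer_prefix, layer_body)
```

Layers are kept as an ordered list rather than a dictionary so that nothing is lost if a prefix letter appears more than once in a longer string.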

The application and importance of InChIs in chemistry and related disciplines are still being explored. Read more here.

RInChI – Reaction InChI

The RInChI organizes the InChIs of the substances involved in a chemical reaction into a unique representation, providing one layer each for reactants, products, agents (catalysts, solvents, etc.), and the direction of the reaction. This makes the RInChI a precise, robust, structure-derived tag for chemical reactions.
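To make the idea of role-based layers concrete, here is a minimal sketch of a reaction record that groups InChIs by role and normalizes their order, so that the same reaction written down in a different order compares equal. Everything in it is an assumption made for illustration: the class and field names are hypothetical, and it does not reproduce the actual RInChI string format, its delimiters, or its ordering rules. The example reaction is the esterification of ethanol with acetic acid.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReactionRecord:
    """Hypothetical container: one group of InChIs per role, plus a direction."""
    reactants: Tuple[str, ...]
    products: Tuple[str, ...]
    agents: Tuple[str, ...] = ()
    direction: str = "forward"  # illustrative label, not RInChI syntax

    def normalized(self) -> "ReactionRecord":
        # Sort within each role so the input order does not matter.
        return ReactionRecord(
            reactants=tuple(sorted(self.reactants)),
            products=tuple(sorted(self.products)),
            agents=tuple(sorted(self.agents)),
            direction=self.direction,
        )


ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
ACETIC_ACID = "InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)"
ETHYL_ACETATE = "InChI=1S/C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3"
WATER = "InChI=1S/H2O/h1H2"

if __name__ == "__main__":
    first = ReactionRecord((ETHANOL, ACETIC_ACID), (ETHYL_ACETATE, WATER))
    second = ReactionRecord((ACETIC_ACID, ETHANOL), (WATER, ETHYL_ACETATE))
    print(first.normalized() == second.normalized())
```

Real RInChI strings and keys should be generated with the official RInChI software rather than with a hand-rolled model like this one.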

InChI at Springer Nature

InfoChem, a subsidiary of Springer Nature that focuses mainly on the storage and retrieval of structure and reaction information, has been involved with RInChI since its initial session in 2008. The first version of RInChI was finalized in March 2017 and is now available to the public.

Springer Nature, a member of the InChI Trust (a not-for-profit that works on development of the standard in collaboration with IUPAC), plans to use this standard for additional information and metadata in its chemistry journals, books and databases. InChIKeys for Springer Nature chemical content will be publicly available via open search portals such as PubChem or linked open data platforms such as SciGraph, thereby opening up data for the wider research community.

The RInChI group will present its experiences with RInChIs and RInChIKeys for large datasets (InfoChem SPRESI), along with other RInChI use cases, to the public in the workshop “Status and Future of the IUPAC InChI”, to be held at the NIH in Bethesda (Washington DC) from 16 to 18 August 2017.

How do you think InChI will help you in your work?