16 Jan

The World Intellectual Property Organization (WIPO) has launched a new standard for sequence listings in international patent applications in the life sciences. The new standard, ST.26, is effective from July 1st, 2022 and addresses how to present amino acid and nucleotide sequences as part of the patent filing process.

WHY A NEW STANDARD FOR SEQUENCE PRESENTATION

One reason for a new standard is the need to harmonize the information held in the public databases of the INSDC (International Nucleotide Sequence Database Collaboration), such as DDBJ, EMBL-EBI, and NCBI. IP offices disclose sequence lists from patent applications but, without an appropriate format or system to exchange and search them automatically, part of the information could be lost or misinterpreted.

Furthermore, standardizing the annotation of sequence features is intended to increase the automation and efficiency of sequence generation and validation in IP offices. Features such as branched sequences, and elements such as D-amino acids or nucleotide analogues, are included in the new standard because they appear frequently in inventions. There is now a single sequence format worldwide, which aims to speed up the filing process.

WHAT ELSE IS NEW IN ST.26 COMPARED TO ST.25 

  • The sequence is presented in XML format (see the example below).
  • Important sequence features such as branched sequences, D-amino acids, and nucleotide analogues are included.
  • Nucleotide sequences are no longer presented in “mixed mode”, with the translation product shown beneath them.
  • Amino acid sequences are represented in one-letter code.
  • The “u” character is no longer used in nucleotide sequences; instead, “t” represents both uracil in RNA and thymine in DNA.
  • The Invention Title can be included in different languages.
  • Short sequences with fewer than 10 specifically defined nucleotides or fewer than 4 specifically defined amino acids are not permitted.
  • Only the earliest priority application should be included.
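
Two of the rules above (the “u” to “t” replacement and the minimum number of specifically defined residues) lend themselves to a quick pre-check before a sequence is entered in a listing. The following is a minimal Python sketch, not an official validator. The function names are hypothetical, and it assumes that nucleotides are written in lowercase and that “n” (nucleotides) and “X” (amino acids) are the only symbols that do not count as specifically defined.

```python
# Illustrative pre-check for two of the ST.26 rules listed above.
# Assumption: "n" and "X" are the symbols that are NOT specifically defined.
UNDEFINED_SYMBOL = {"nucleotide": "n", "protein": "X"}

# Minimum number of specifically defined residues, as stated in the list above.
MINIMUM_DEFINED = {"nucleotide": 10, "protein": 4}


def normalize_nucleotides(sequence: str) -> str:
    """Lowercase the sequence (assumed convention) and replace 'u' with 't'."""
    return sequence.lower().replace("u", "t")


def count_specifically_defined(sequence: str, molecule_type: str) -> int:
    """Count the residues that differ from the 'undefined' symbol."""
    undefined = UNDEFINED_SYMBOL[molecule_type].lower()
    return sum(1 for symbol in sequence if symbol.lower() != undefined)


def meets_minimum_length(sequence: str, molecule_type: str) -> bool:
    """Return True if the sequence has enough specifically defined residues."""
    defined = count_specifically_defined(sequence, molecule_type)
    return defined >= MINIMUM_DEFINED[molecule_type]


if __name__ == "__main__":
    rna = "AUGGCUAACG"
    dna_style = normalize_nucleotides(rna)
    print(dna_style, meets_minimum_length(dna_style, "nucleotide"))
    print("MKV", meets_minimum_length("MKV", "protein"))
```

A check like this only flags candidates for exclusion; the final decision should rest on validation with the official tooling.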

TRANSLATE OR DO NOT TRANSLATE THE SEQUENCE LIST 

ST.26 requires the sequence list to be presented in XML format, which means that the list contains tags and text that encode the document so that both users and IT systems can read and interpret it. In this markup, elements, values, and attributes describe the sequence features. When a value is text "indispensable for the understanding of a characteristic of the sequence", it is designated as "free text". Such values may, for example, state the function of the sequence, its binding site, or its source.

Translation comes into play when the free text is required in a language other than that of the original document; in that case, it is “language-dependent free text”. The translated value should have no more than 1000 characters. Examples of language-dependent qualifiers are the cell and tissue type, the cultivar or plant variety, and the source organism (when the latter is given as a Latin genus and species name, it does not require translation). Qualifiers designated as “note” introduce a comment about the feature and may also be translated. The full list of these qualifiers can be found in Tables 5 and 6 of ST.26 Annex I.

In the sequence list, a language-dependent value is held in “<INSDQualifier_value>” and its translation is included in “<NonEnglishQualifier_value>”.

For example, in a patent filed in Spanish, an illustrative element can be expressed as:

<INSDQualifier_value>essential for recognition of cofactor</INSDQualifier_value>

<NonEnglishQualifier_value>esencial para el reconocimiento del cofactor</NonEnglishQualifier_value>
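
The pair of values above can also be generated programmatically. The sketch below uses Python's standard xml.etree.ElementTree module and enforces the 1000-character limit on the translated value mentioned earlier. The helper name build_qualifier is hypothetical, and the enclosing “INSDQualifier” element with its “INSDQualifier_name” child is an assumption about the surrounding structure that should be checked against the ST.26 specification.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Limit on the translated free text, as stated in the article.
MAX_TRANSLATED_LENGTH = 1000


def build_qualifier(name: str, english_value: str,
                    translated_value: Optional[str] = None) -> ET.Element:
    """Build one qualifier holding an English value and an optional translation.

    The wrapper element and the name child are assumed here; only the two
    *_value element names come from the example in the text.
    """
    if translated_value is not None and len(translated_value) > MAX_TRANSLATED_LENGTH:
        raise ValueError("translated free text exceeds 1000 characters")

    qualifier = ET.Element("INSDQualifier")
    ET.SubElement(qualifier, "INSDQualifier_name").text = name
    ET.SubElement(qualifier, "INSDQualifier_value").text = english_value
    if translated_value is not None:
        ET.SubElement(qualifier, "NonEnglishQualifier_value").text = translated_value
    return qualifier


if __name__ == "__main__":
    element = build_qualifier(
        "note",
        "essential for recognition of cofactor",
        "esencial para el reconocimiento del cofactor",
    )
    # Serialize to a string for inspection.
    print(ET.tostring(element, encoding="unicode"))
```

In practice, WIPO Sequence produces this markup itself; a script like this is mainly useful for checking translated values in bulk before they are imported.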

The title of the invention may also be translated into the language of filing using the Latin alphabet. The title should contain between two and seven words, and each translation is included in an additional “InventionTitle” element.
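
As a rough illustration of the title rule, the Python sketch below checks the word count and builds one “InventionTitle” element per language. The function name is hypothetical, the “languageCode” attribute is an assumption about how the language is recorded, and counting words by splitting on whitespace is a simplification.

```python
import xml.etree.ElementTree as ET


def build_invention_title(title: str, language_code: str) -> ET.Element:
    """Return an InventionTitle element after checking the 2-7 word range."""
    word_count = len(title.split())  # simplification: whitespace-separated words
    if not 2 <= word_count <= 7:
        raise ValueError(f"title has {word_count} words; expected 2 to 7")

    # Assumption: the language is stored in a 'languageCode' attribute.
    element = ET.Element("InventionTitle", {"languageCode": language_code})
    element.text = title
    return element


if __name__ == "__main__":
    titles = [
        build_invention_title("Cofactor binding enzyme variants", "en"),
        build_invention_title("Variantes de enzima de unión a cofactor", "es"),
    ]
    for item in titles:
        print(ET.tostring(item, encoding="unicode"))
```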

HOW TO PREPARE AND VALIDATE THE SEQUENCE LIST 

WIPO has designed a software tool named WIPO Sequence that allows the user to create and validate a sequence list that complies with ST.26. A sequence list can be imported from other projects or in FASTA format, without editing the XML file directly. Information about organisms, qualifiers, and features can be selected from drop-down menus. It is also possible to export and import XLIFF files for translating the language-dependent qualifiers of the sequence list. XLIFF files are commonly used in CAT tools (e.g. TRADOS and Memsource), which allow translation projects to be stored and managed consistently. For more information about the use of CAT tools in patent translation, read our blog below.


Author: Kenia Salazar Díaz – Ph.D. Biochemistry | Life Science Translator 


REFERENCES: 

1. WIPO Standard ST.26. INTRODUCTION. Webinar training

2. STANDARD ST.26. Version 1.5.

3. WIPO Sequence Suite.

4. ST.26 Annex I.               
