This article describes the use of a software application, mAbScale, for the calculation of masses for monoclonal antibody-based protein therapeutics.
Biotherapeutic masses are a means of verifying identity and structural integrity. Mass spectrometry (MS) of intact proteins or protein subunits provides an easy analytical tool for different stages of biopharmaceutical development. The protein's identity is confirmed when the experimental mass from MS is within a pre-defined mass error range of the theoretical mass. While several computational tools exist for the calculation of protein and peptide molecular weights, they either were not designed for direct application to biotherapeutic entities, have access limitations due to paid licenses, or require uploading protein sequences to host servers.
We have developed a modular mass calculation routine that enables the easy determination of the average or monoisotopic masses and elemental compositions of therapeutic glycoproteins, including monoclonal antibodies (mAb), bispecific antibodies (bsAb), and antibody-drug conjugates (ADC). The modular nature of this Python-based calculation framework will allow the extension of this platform to other modalities such as vaccines, fusion proteins, and oligonucleotides in the future, and this framework could also be useful for the interrogation of top-down mass spectrometry data. By creating an open-source standalone desktop application with a graphical user interface (GUI), we hope to overcome the restrictions around use in environments where proprietary information cannot be uploaded to web-based tools. This article describes the algorithms and application of this tool, mAbScale, to different antibody-based therapeutic modalities.
Over the past two decades, biotherapeutics have evolved to become a mainstay of the modern pharmaceutical industry. The SARS-CoV-2 pandemic and other life-threatening conditions have further increased the need for faster and broader development of biopharmaceutical molecules1,2,3.
The biotherapeutic molecular weight is critical for the identification of the molecule, in combination with other analytical assays. The intact and reduced subunit masses are used throughout the discovery and development lifecycles as part of control strategies aimed at maintaining the quality, as described in the QTPP (Quality Target Product Profile)4.
Analytical development in the biopharmaceutical industry relies heavily on mass measurements for intact mass analysis and deep characterization using peptide mapping or multi-attribute method (MAM) monitoring. At the center of these techniques utilizing modern mass spectrometry (MS) platforms is the ability to provide high-resolution accurate mass (HR/AM) measurements. Most HR/AM instruments yield mass accuracies in the range of 0.5-5 ppm, which scale with the mass range. The ability to measure masses accurately for intact large molecules enables the quick and confident identification of large-molecule therapeutics. As isotopic resolution cannot be attained using typical experimental conditions for large molecules (>10 kDa), average masses must be calculated for comparison and identification5,6.
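The comparison of an experimental mass against a theoretical mass can be sketched in a few lines of Python. This is a minimal illustration rather than part of mAbScale: the function names and the example masses and tolerance are hypothetical, and an appropriate tolerance depends on the instrument and method.

```python
def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error of a measurement, in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def within_tolerance(measured: float, theoretical: float, tol_ppm: float) -> bool:
    """True when the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(measured, theoretical)) <= tol_ppm


# Hypothetical values chosen for illustration only (not measured data).
theoretical_mass = 148000.0   # Da, assumed theoretical average mass
measured_mass = 148001.48     # Da, assumed deconvoluted mass
error = ppm_error(measured_mass, theoretical_mass)
matches = within_tolerance(measured_mass, theoretical_mass, tol_ppm=20.0)
```

Because the error is expressed relative to the theoretical mass, the same ppm tolerance corresponds to a larger absolute window in daltons for an intact antibody than for a reduced subunit.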
A typical intact or subunit protein mass spectrum represents the overall proteoform profile, which contains composite information on the various molecular forms resulting from post-translational modifications (PTM) and any primary structure differences, such as clips or sequence variants. The relatively easy and high-throughput nature of these measurements makes them attractive for characterization and as in-process monitoring controls7,8. Data analysis for these experiments usually requires the user to define the search space for molecular forms (range of PTMs or other molecular forms). For glycosylated proteins, this search space is largely driven by glycoform heterogeneity. Combinations of multiple PTMs, disulfide bond configurations, and other variations along the primary structure make calculating all the possible molecular forms a tedious task. Therefore, the manual calculation of the possible molecular forms is a time- and resource-consuming process with a high potential for human error.
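The growth of the glycoform search space can be illustrated with a short Python sketch. It assumes that the glycosylation sites on the two heavy chains are equivalent, so that glycan pairs are counted without regard to order. The glycan names and function names are illustrative, and this is not necessarily how mAbScale enumerates forms internally.

```python
import math
from itertools import combinations_with_replacement


def glycoform_pairs(glycans):
    """Unordered glycan pairs for two equivalent glycosylation sites."""
    return list(combinations_with_replacement(glycans, 2))


def search_space_size(n_glycans: int, n_sites: int = 2) -> int:
    """Number of unordered assignments of n_glycans to n_sites equivalent sites."""
    return math.comb(n_glycans + n_sites - 1, n_sites)


# A small illustrative library; a real library would be much larger.
library = ["G0F", "G1F", "G2F", "Man5"]
pairs = glycoform_pairs(library)          # includes ("G0F", "G0F"), ("G0F", "G1F"), ...
n_small = search_space_size(len(library))  # 10 pairs for 4 glycans
n_large = search_space_size(88)            # 3916 pairs for an 88-glycan library
```

Even before other PTMs or chemical modifications are considered, a library of 88 glycans gives 3916 distinct glycan pairs for a single two-site antibody, which is why manual enumeration quickly becomes impractical.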
Here, we present a mass calculation tool that was developed considering the most important features of biotherapeutic molecules, such as mAbs, bsAbs, ADCs, etc. The tool allows the easy incorporation of search-space variables for the consistent calculation of masses and elemental compositions. The modular nature of this tool will enable it to be further developed and applied to mass calculation and mass matching for other modalities.
The GUI module allows the user to specify the input for the mass calculation, as shown in Figure 1; specifically, the user enters single-letter amino acid sequences for light and heavy antibody chains. Common modifications for heavy-chain N-terminal cyclization and C-terminal lysine clipping are included as check boxes. Further, a chemical formula (elemental composition) can be added to or subtracted from these protein chains through the respective Chem Mod text box. This gives the user the flexibility to add an elemental composition that includes multiple post-translational modifications or a small-molecule payload in the case of an ADC. As most therapeutic mAbs are engineered to remove the glycosylation sites in the light chain, glycosylation in the light chain is left optional and can be specified using a check box on the GUI.
A typical variation on intact mass analysis for antibodies is a reduced subunit mass analysis, where the light chain is detached from the heavy chain by reducing the interchain disulfide bonds. Depending on the strength of the reducing agent used, the intrachain disulfide bonds may or may not be cleaved. The users have the flexibility of entering the total number of disulfide bonds depending on the IgG subtype or in case of a cysteine-conjugated ADC9.
The application calculates masses in a bottom-up manner, in which the elemental compositions are first calculated for the individual heavy chains and light chains. Next, heavy chain (HC) N-terminal cyclization and C-terminal Lys clipping are accounted for by adjusting the calculated elemental compositions. Any specified chemical modifications are then applied to the heavy and/or light chains. Depending on the type of analysis and the disulfide-bond patterns specified by the user, the number of hydrogens is adjusted for the two polypeptide chains. The glycosylated HC and light chain (LC) (optional) masses are calculated based on the user's input. Finally, multiple HC and LC masses are combined, and the disulfide bond numbers are automatically updated for the intact mass calculation.
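The first steps of this bottom-up calculation can be sketched as follows. This is a simplified illustration under stated assumptions, not the mAbScale source code: the function name and parameters are hypothetical, N-terminal cyclization is modeled as pyroglutamate formation (loss of NH3 from Gln or H2O from Glu), and each disulfide bond is modeled as the loss of two hydrogens. Glycans and chemical modifications are omitted.

```python
from collections import Counter

ELEMENTS = ("C", "H", "N", "O", "S")

# Elemental composition (C, H, N, O, S) of each amino acid residue,
# i.e. the free amino acid minus one water.
RESIDUES = {
    "G": (2, 3, 1, 1, 0),  "A": (3, 5, 1, 1, 0),  "S": (3, 5, 1, 2, 0),
    "P": (5, 7, 1, 1, 0),  "V": (5, 9, 1, 1, 0),  "T": (4, 7, 1, 2, 0),
    "C": (3, 5, 1, 1, 1),  "L": (6, 11, 1, 1, 0), "I": (6, 11, 1, 1, 0),
    "N": (4, 6, 2, 2, 0),  "D": (4, 5, 1, 3, 0),  "Q": (5, 8, 2, 2, 0),
    "K": (6, 12, 2, 1, 0), "E": (5, 7, 1, 3, 0),  "M": (5, 9, 1, 1, 1),
    "H": (6, 7, 3, 1, 0),  "F": (9, 9, 1, 1, 0),  "R": (6, 12, 4, 1, 0),
    "Y": (9, 9, 1, 2, 0),  "W": (11, 10, 2, 1, 0),
}


def chain_composition(sequence, n_term_cyclization=False,
                      clip_c_term_lys=False, disulfide_bonds=0):
    """Elemental composition of one polypeptide chain (hypothetical helper)."""
    seq = sequence.upper()
    # C-terminal lysine clipping: drop the last residue if it is Lys.
    if clip_c_term_lys and seq.endswith("K"):
        seq = seq[:-1]
    comp = Counter()
    for aa in seq:
        comp.update(dict(zip(ELEMENTS, RESIDUES[aa])))
    # Add one water for the free N- and C-termini.
    comp.update({"H": 2, "O": 1})
    # N-terminal cyclization to pyroglutamate (assumed model).
    if n_term_cyclization:
        if seq.startswith("Q"):
            comp.subtract({"N": 1, "H": 3})   # loss of NH3
        elif seq.startswith("E"):
            comp.subtract({"H": 2, "O": 1})   # loss of H2O
    # Each disulfide bond removes two hydrogens.
    comp.subtract({"H": 2 * disulfide_bonds})
    return dict(comp)


dipeptide = chain_composition("GA")  # glycylalanine, C5H10N2O3
bridged = chain_composition("CC", disulfide_bonds=1)
```

Working with element counts rather than masses until the final step means the same composition can be converted to either an average or a monoisotopic mass, and the choice of elemental mass table stays in one place.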
With larger molecules such as intact proteins, monoisotopic masses cannot be measured due to the additive mass defect when using mass spectrometers with typical resolving power. Instead, nominal or average masses are measured or reported5,10,11,12,13. The average elemental masses can vary based on the source used for the curated masses14,15. While the differences in elemental masses may be small, they can add up to significant values for large-molecule molecular weight calculations. The average elemental masses used by default in the software application are shown in Supplementary Table 1. For regulated environments like the biopharmaceutical research and development (R&D) field, it is important to maintain consistent molecular masses because changes in masses may imply changes to the molecular entity during regulatory filings. To enable consistency in the use of elemental masses, a dictionary of elemental masses is included with the software tool as a comma-separated value (csv) text file: Element_Mass.csv (Supplementary Coding File 1). Similarly, a curated list of glycan compositions typically seen on mAbs is included: Glycan.csv (Supplementary Coding File 2). Both files are saved in the same folder as the executable application and can be modified by the user to apply a specific elemental mass list or glycan library.
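The role of an editable elemental mass table can be illustrated with the sketch below. The column names ("Element", "Mass") are assumptions, since the actual layout of Element_Mass.csv is not described here. The sample values are abridged IUPAC standard atomic weights and are not necessarily the defaults listed in Supplementary Table 1.

```python
import csv
import io


def load_element_masses(handle):
    """Read an element-to-mass table from CSV (assumed columns: Element, Mass)."""
    reader = csv.DictReader(handle)
    return {row["Element"]: float(row["Mass"]) for row in reader}


def average_mass(composition, masses):
    """Sum of element counts multiplied by the chosen average elemental masses."""
    return sum(masses[element] * count for element, count in composition.items())


# In-memory stand-in for a mass table file; the layout is hypothetical.
SAMPLE_CSV = "Element,Mass\nH,1.008\nC,12.011\nN,14.007\nO,15.999\nS,32.06\n"
masses = load_element_masses(io.StringIO(SAMPLE_CSV))

water = average_mass({"H": 2, "O": 1}, masses)
glycine = average_mass({"C": 2, "H": 5, "N": 1, "O": 2}, masses)
```

To read from a file instead, the same function can be given an open file handle, for example `open("Element_Mass.csv", newline="")`, provided the header names match. Swapping in a different mass table changes every downstream mass consistently, which is the point of keeping the values in a single external file.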
Figure 1: Graphical user interface of the mAbScale application. The GUI module allows the user to specify the input for the mass calculation. The user enters single-letter amino acid sequences for the light and heavy antibody chains. Common modifications for the heavy-chain N-terminal cyclization and C-terminal lysine clipping are included as check boxes. Chemical formulas/elemental compositions can be added/subtracted through the respective Chem Mod text box.
The high-level workflow for mAbScale is shown in Figure 2. Each step has more sophisticated inner decision branches, loops, and combinatorics. A detailed algorithmic workflow describing the calculation process is presented in Supplementary Figure 1. The application output is saved in a spreadsheet format in the user-selected folder. The output file consists of multiple separate worksheets, which can be categorized as the user input, molecular weight calculations, and references for the average isotopic mass derivations (example output is provided in supplemental tables). The user input worksheets include the protein amino acid sequences and other information entered by the user, averaged elemental masses, and glycan masses, which are used to calculate the elemental composition and different molecular weights. The molecular weight calculation sheets include the chemical composition of various forms, the reduced mass with and without glycosylation and chemical modification, and the intact mass with and without glycosylation and chemical modification. Sheets containing half-antibody masses will be generated automatically if the user enters two different HCs and/or two different LCs in the user input page, since half-antibodies are primary impurities that need to be identified and quantified relative to the desired heterodimer. The source code for mAbScale can be accessed through the following repository: https://github.com/kkhatri99/mAbScale.
Figure 2: Overview of the steps involved in the calculation of elemental compositions and masses using the application. Color coding can be used to link to the process flow described in Supplementary Figure 1.
1. Opening the mAbScale application
2. Sequence entry
3. Specifying the number of disulfide bonds
4. Setting the output folder and running the application
A variety of mAbs were selected to represent different types of mAbs. A commercially available mAb standard was selected to represent a conventional mAb with identical heavy chains, identical light chains, and one N-linked glycosylation site in the Fc region. A mAb with an additional light chain N-linked glycosylation, a bispecific mAb, and an antibody-drug conjugate (ADC) mAb were also chosen to widen the application usage. The chemical composition, calculated mass, measured mass, and mass error of these example mA...
mAbScale provides an intuitive user interface with the flexibility to alter the building blocks for mass and elemental calculations. The users are expected to have a basic understanding of the target molecule to use the application, derive correct masses, and interpret the results. For example, the intact or reduced mass output sheet can be overwhelming due to the numerous rows of intact or reduced masses, since the default glycan database contains 88 N-linked glycans that are commonly found in the Fc portion of therapeu...
This software is being released under the Apache 2.0 license. Copyright (2022) of GlaxoSmithKline Research & Development Limited. All rights reserved. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "as is" basis, without warranties or conditions of any kind, either expressed or implied. See the License for the specific language governing permissions and limitations under the License. L.C. is a GlaxoSmithKline (GSK) employee. T.H. and K.K. developed this software as employees of GSK and are now associates of Merck and Moderna, respectively.
The authors thank Robert Schuster for assistance with data verification.
Name | Company | Catalog Number | Comments |
Acquity UPLC system | Waters Corp., Milford, MA | N/A | Modular system |
Antibody-drug conjugate (ADC) | GlaxoSmithKline | N/A | Proprietary molecule |
BEH 200 SEC column | Waters Corp., Milford, MA | 176003904 | |
Bispecific mAb | GlaxoSmithKline | N/A | Proprietary molecule |
Byos | Protein Metrics, Cupertino, CA | https://proteinmetrics.com/byos/ Version 4.5 | |
GPMAW | GPMAW | http://www.gpmaw.com/ | |
LC-MS grade water | Thermo Fisher Scientific, Waltham, MA | W6-1 | |
mAb standard | Waters Corp., Milford, MA | 186009125 | Waters Humanized mAb Mass Check Standard |
mAbScale | GlaxoSmithKline | Apache License, Version 2.0 | |
Xevo G2 Q-TOF mass spectrometer | Waters Corp., Milford, MA | N/A | Modular system |
This article has been published
ISSN 2578-2037
Copyright © 2025 MyJoVE Corporation. All rights reserved