Higher-order structural characterisation of native proteins and complexes by top-down mass spectrometry

In biology, it can be argued that if the genome contains the script for a cell's life cycle, then the proteome constitutes an ensemble cast of actors that brings these instructions to life. Their interactions with each other, co-factors, ligands, substrates, and so on, are key to understanding nearly any biological process. Mass spectrometry is well established as the method of choice to determine protein primary structure and location of post-translational modifications. In recent years, top-down fragmentation of intact proteins has been increasingly combined with ionisation of noncovalent assemblies under non-denaturing conditions, i.e., native mass spectrometry. Sequence, post-translational modifications, ligand/metal binding, protein folding, and complex stoichiometry can thus all be probed directly. Here, we review recent developments in this new and exciting field of research. While this work is written primarily from a mass spectrometry perspective, it is targeted to all bioanalytical scientists who are interested in applying these methods to their own biochemistry and chemical biology research.

Mowei Zhou, Carter Lantz, Kyle A. Brown, Ying Ge, Ljiljana Paša-Tolić, Joseph A. Loo and Frederik Lermyte*
Mowei Zhou is a scientist at the Environmental Molecular Sciences Laboratory (EMSL), located at Pacific Northwest National Laboratory. He obtained a chemistry BS degree from Wuhan University, China. He pursued an analytical chemistry PhD under the supervision of Prof. Vicki Wysocki (University of Arizona, later transferred to Ohio State University), with a focus on implementing surface-induced dissociation into commercial mass spectrometers for protein quaternary structure characterisation. Before joining EMSL, he did a postdoc at the U.S. Food and Drug Administration. His main research interest is the development of novel MS techniques to understand proteins with unknown functions.
Carter Lantz is a graduate student at the University of California Los Angeles. He received a B.A. in University Scholars from Baylor University in the spring of 2017. He started his graduate studies at UCLA in the fall of 2017. He is currently a student in Professor Joseph Loo's lab where he utilizes native top-down MS and IM-MS to determine how post-translational modifications and small molecule inhibitors affect the structure of amyloid proteins such as tau and α-synuclein.

Introduction and historical perspective
Proteins are the main effectors of biological change; therefore, it is critical to assign their function and dysfunction in cells. In the early 2000s, the human genome was sequenced, leading to the identification of ~20 000 genes, which might seem like a relatively low number when considering our biological complexity. 1 The post-genomic era has focused on understanding the downstream diversity that arises as DNA is transcribed into mRNA and then translated into proteins.
At the DNA level, mutations and single-nucleotide polymorphisms represent a substantial source of variation. 2 During the process of transcription, about 93% of human genes undergo alternative splicing, resulting in variations at the mRNA level. 2 Further diversity can be introduced after mRNA is translated, with various post-translational modifications (PTMs) occurring to the proteins. 2 PTMs are not directly encoded in the genome, and many of them are dynamically regulated in response to environmental stress. Combined, DNA, mRNA, and protein-level variations give rise to a diverse set of molecular forms that derive from a single gene. In 2013, a single term, 'proteoform', 1,3 was adopted to clarify the nomenclature surrounding protein complexity and to promote basic and clinical research efforts towards developing technologies for proteoform characterisation. 1 The total number of human proteoforms has been estimated to be in the hundreds of thousands or even millions. 2 Mass spectrometry (MS) has emerged as the most versatile and comprehensive method for proteoform characterisation. 4 Proteins form various noncovalent complexes to perform their biological functions, further complicating the proteome landscape beyond what has been traditionally understood under the proteoform concept. Although MS has mainly been used to obtain information regarding protein sequence, it has been increasingly utilized to understand higher-order structure. 5 Characterisation of large biomolecules by MS was made possible by soft ionisation techniques such as electrospray ionisation (ESI) 6 and matrix-assisted laser desorption/ionisation (MALDI) 7,8 developed in the late 1980s. By the year 2000, these innovations, ESI-MS in particular, were used to analyse biomolecules with molecular weights up to 1 MDa.
Today, nanoESI, 9,10 in which the flow rate and droplet size are drastically reduced, resulting in far less sample consumption, as well as many other ionisation techniques, are coupled to many different MS instrument platforms for a broad range of applications. 5,11

Top-down MS (TDMS) as a powerful tool for comprehensive protein characterisation
In the conventional bottom-up protein analysis approach, proteins are extracted, chemically or enzymatically digested, separated by liquid chromatography (LC), ionised via ESI, and analysed by MS, allowing identification, quantification, and PTM characterisation for many thousands of proteins. Information regarding protein isoforms, PTM stoichiometry, and combinatorial PTMs is, however, lost when using peptides as protein surrogates. 12,13 The bottom-up approach has also been applied for higher-order structural characterisation using methods such as limited proteolysis, 14 chemical crosslinking, 15 and protein footprinting. 16 'Top-down' mass spectrometry, which forgoes the digestion step, has proven to be the premier MS-based technology for unambiguous proteoform characterisation, enabling in-depth sequencing, the discovery of novel proteoforms, and quantification of disease-associated PTMs. 13 While some technical challenges remain, developments in top-down protein analysis over the past few years have advanced its capability to unambiguously identify, characterise, and quantify thousands of proteoforms with high throughput. The growth, development, and biomedical applications of top-down protein MS are covered in several recent reviews. 5,12,13,17 Simultaneously, another MS-based technology that has enabled key new biological insights is known as native MS. Native MS aims to preserve the solution structure of proteins and protein complexes during the transfer to the gas phase. Some of the key background terminology, abbreviations, and techniques relevant to the rapidly evolving fields of native and top-down MS are listed in Table 1.

Table 1 Brief explanation of some key terminology and techniques used in top-down and native mass spectrometry

Term: Meaning
TDMS: Top-down mass spectrometry; tandem MS of intact protein ions, with no enzymatic or chemical digestion step
TDP: Top-down proteomics; large-scale application of TDMS to (potentially) all proteins present in a cell, tissue, or organism, usually with the goal of understanding biological processes and gene expression control
CID/CAD: Collision-induced/collisionally activated dissociation; increasing the internal energy of ions by collisions with inert background gas molecules, a process in which energy is converted from translational to vibrational modes, resulting in dissociation of noncovalent and/or covalent bonds
HCD: Higher-energy collisional dissociation; used in Orbitrap instruments to distinguish 'beam-type' collisional activation in non-trapping multipoles from activation by resonant excitation in ion traps. Both processes involve collisions with background gas and generate qualitatively similar spectra. Although direct comparison of energy parameters is not trivial due to different instrument designs, HCD generally accesses higher-energy fragmentation pathways
ECD: Electron capture dissociation; fragmentation method for cations, based on gas-phase radical chemistry, in which a hydrogen-rich radical is formed by capture of (typically) a single low-energy (1-3 eV) electron by a biomolecular (typically a protein or peptide) cation
ETD: Electron transfer dissociation; similar to ECD, but the electron originates from a radical anion rather than an electron beam from a cathode emitter
EID: Electron ionisation dissociation; excitation of cations by fast electrons with energy at least 10 eV higher than the ionisation threshold of the cations
ExD: A general term referring to electron-based activation, including ECD, ETD, and EID
UVPD: Ultraviolet photodissociation; method in which fragmentation is initiated by capture of (typically) a single ultraviolet (10-400 nm) photon. The exact mechanism depends on the photon wavelength, as described in the main text
IRMPD: Infrared multiphoton photodissociation; method in which fragmentation is initiated by capture of many infrared (780 nm to 1 mm; typically ca. 10 μm is used in practice) photons, leading to a gradual increase in internal energy and similar fragmentation behaviour to CID
SID: Surface-induced dissociation; method for ion activation/fragmentation based on accelerating an ion and colliding it with a surface within the mass spectrometer
Native MS: Native mass spectrometry; analysis by MS of biomolecules (primarily proteins) from non-denaturing solutions and using low-energy conditions in the source of the mass spectrometer, with the aim of preserving the higher-order structure in the gas phase
Native TD: Native top-down; gas-phase fragmentation of covalent bonds in an intact biomolecule or complex in a conformation-sensitive manner, so that information about higher-order structure can be inferred from the fragmentation pattern
nECD/nETD: Native electron capture/transfer dissociation; use of these two electron-based fragmentation methods for native TD mass spectrometry
Complex-up MS: The process of using ion activation to eject one or more monomers or ligands from a biomolecular complex without inducing significant cleavage of covalent bonds, so that, depending on the activation method used, monomer/ligand mass and/or subunit connectivity can be determined from the ejected species
Complex-down MS: The process of using ion activation to eject a monomer or ligand from a biomolecular complex, while inducing significant cleavage of covalent bonds (either in a single step with ejection or in separate stages), so that sequence or structural information on the ejected species can be obtained

Native mass spectrometry of protein complexes
For nearly thirty years, ongoing efforts have endeavoured to use MS to understand noncovalent protein complexes by transferring them into the gas phase without loss of higher-order structure. 5,18 At its most basic level, native MS can return information on the makeup of protein complexes by providing molecular weights more accurately than conventional biophysical methods (e.g., size-exclusion chromatography (SEC) or analytical ultracentrifugation), especially for heterogeneous samples. In relatively simple cases, this can suffice to indicate the number of monomers in a complex and determine differences in complex makeup. 19-22 In addition to mass alone, the 3D shape of proteins and complexes can be simultaneously investigated by ion mobility-mass spectrometry (IM-MS). 23 Previous studies have shown that IM-MS correlates with known protein structure, 24 and that binding of ligands, cofactors, or metal ions can affect the observed structure, with important implications for e.g., drug discovery. 25 In addition, collision-induced unfolding (CIU), i.e., increasing the internal energy of an ion prior to IM-MS analysis, allows concomitant study of conformational stability, providing more details than IM alone. 26 While it is now commonly accepted that the stoichiometry of noncovalent protein complexes can be determined using native MS, the extent to which protein solution folding is retained, i.e., the degree of 'nativeness' of the gas-phase ions, is more controversial. Ion mobility experiments show that the overall 3D shape of proteins, especially in the lower charge states naturally generated by ESI from non-denaturing solutions, is usually consistent with structures obtained from X-ray diffraction (XRD) and nuclear magnetic resonance (NMR) (although exceptions exist). 24 Additional compelling evidence has been provided by 'soft-landing' experiments, in which gas-phase ions of large protein complexes are mass-selected and then gently decelerated and collected on a grid. Subsequent electron microscopy (EM) imaging then demonstrated preservation of native-like structures throughout the process of ionisation, dehydration, and soft-landing. 27,28 Other analyses have further indicated that the overall structure of proteins is generally conserved in the gas phase. Using electron capture dissociation (ECD), McLafferty and co-workers have argued for refolding of small proteins in the gas phase to a non-native secondary and tertiary structure. 29 Conversely, using electron transfer dissociation (ETD), Vachet and co-workers found that the gas-phase salt bridge pattern of small proteins was more consistent with the pattern present in the known native structure than any non-native alternatives, 30,31 and gas-phase infrared spectroscopy carried out by von Helden and co-workers showed results consistent with preservation of alpha-helices and beta-sheets in native MS of myoglobin and beta-lactoglobulin, respectively. 32 All of this supports the idea that, while significant expertise and care are needed, gas-phase structure may in fact reflect important aspects of native solution structure.
As ions formed under non-denaturing ESI conditions typically have low charge states, native MS instruments must be able to transmit and detect high-m/z ions. Today, this is possible using commercially available time-of-flight, 33 Orbitrap, 34 and Fourier transform ion cyclotron resonance (FTICR) 35 instruments. These instruments can transmit large protein complexes without providing excessive activation that could compromise protein complex structure. In recent years, important progress in native purification methods such as native gel-eluted liquid fraction entrapment electrophoresis (GeLFrEE) separation, 36 native gel electrophoresis, 37 ion exchange chromatography (IEX), 38 hydrophobic interaction chromatography (HIC), 39,40 and online buffer exchange, has made native MS more applicable to complex mixtures and to the characterisation of endogenous ligands. 36,41 By using appropriate sample preparation, ionisation conditions, and instrumentation, many challenging analytes such as membrane proteins, 18,42 intrinsically disordered proteins, 43 highly dynamic or heterogeneous complexes, 19 or very large systems such as intact virus capsids 44 can all be investigated using native MS, as well as their associated proteoforms. 45 Many of these are highly challenging to study by other analytical techniques such as NMR or XRD, and therefore the insights from MS can be vital to understanding these proteins and complexes. It is worth noting here that, while lower-resolution, MS-based methods are not necessarily less native than conventional methods: the crystalline state is far removed indeed from the native protein environment. These classical methods are also more prone than MS to sampling a single low-energy state or an ensemble average, and a combination of biophysical approaches can be needed to capture the dynamic nature of protein conformation.
One aspect of protein structure to which native MS can be expected to extend in the coming years is quinary structure, which is defined by specific interactions in the crowded cellular environment that are weaker and more transient than those responsible for quaternary structure, and which has recently been successfully investigated with non-native MS methods such as chemical crosslinking. 46,47 Continued advances in sample processing and instrument development are expected to make these new experiments more routine for protein complex analysis in the near future. As will be discussed in the rest of this review, native ionisation has in recent years been increasingly combined with top-down protein fragmentation, allowing probing of different structural levels and relating sequence information to higher-order structure and complex formation.
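As noted earlier in this section, ions formed under non-denaturing conditions carry few charges and therefore appear at high m/z. The arithmetic behind this, and behind reading a mass from two neighbouring charge states, can be sketched in Python. The sketch is illustrative only: the 150 kDa mass and the charge states 25+ and 100+ are hypothetical round numbers, not values from the studies cited above, and ions are assumed to be charged solely by protonation.

```python
PROTON_MASS = 1.007276  # Da; charge carrier assumed to be a proton


def mz_from_mass(mass_da: float, charge: int) -> float:
    """m/z of an [M + zH]^z+ ion."""
    return (mass_da + charge * PROTON_MASS) / charge


def charge_from_adjacent_peaks(mz_lower_charge: float, mz_higher_charge: float) -> int:
    """Charge z of the peak at mz_lower_charge, given its (z + 1) neighbour.

    Follows from mz(z) - mz(z + 1) = M / (z * (z + 1)) and
    mz(z + 1) - proton = M / (z + 1).
    """
    spacing = mz_lower_charge - mz_higher_charge
    return round((mz_higher_charge - PROTON_MASS) / spacing)


def mass_from_peak(mz: float, charge: int) -> float:
    """Neutral mass recovered from one peak of known charge."""
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    mass = 150_000.0  # hypothetical complex, Da
    print("native-like (25+) m/z:", round(mz_from_mass(mass, 25), 1))
    print("denatured-like (100+) m/z:", round(mz_from_mass(mass, 100), 1))

    # Two neighbouring peaks of the same species are enough to assign charge.
    peak_a = mz_from_mass(mass, 25)
    peak_b = mz_from_mass(mass, 26)
    z = charge_from_adjacent_peaks(peak_a, peak_b)
    print("assigned charge:", z, "mass:", round(mass_from_peak(peak_a, z), 1))
```

In practice, peak broadening from incomplete desolvation means the measured m/z values carry more uncertainty than this idealised calculation suggests.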
2. Gas-phase activation of intact, native proteins and complexes
2.1 Activation of protein complexes without backbone cleavage for quaternary structure ('complex-up' methodology)
2.1.1 Native MS alone provides limited information for heterogeneous complexes
The complex-up strategy aims at subunit dissociation of noncovalent complexes without cleaving covalent bonds. 17 Native MS without breaking up the complexes only provides limited information regarding quaternary structure and is largely blind to subunit connectivity and location of ligand binding within the complexes. For unknown complexes, intact mass alone is not enough for determining stoichiometry and composition. In addition, gentle tuning conditions used to maintain structural integrity of noncovalent complexes can result in insufficient desolvation, peak broadening, and increased uncertainty in mass determination. For fragile complexes, the result of this is that the achievable mass resolution may be too low for precisely defining the binding of small ligands. Recent publications by the Kelleher 48 and Heck 49 laboratories have demonstrated the use of charge detection MS of noncovalent complexes on Orbitrap instruments. In these experiments, small numbers of ions were allowed in the trap, allowing highly repeatable mass measurement of individual ions. A histogram of the single-particle centroid masses constructed after thousands of these measurements provides significantly (approximately an order of magnitude) higher resolving power than conventional Orbitrap MS. Still, heterocomplexes that have multiple subunits with very similar masses are difficult to characterise just from the intact mass, especially when there is high uncertainty in mass measurement. 50 Another limitation of mass measurement alone of intact proteins and complexes is that their (average) masses might shift slightly due to natural variations in isotopic abundance, an effect that can cause mass shifts greater than the accuracy of modern high-end mass spectrometers. 51

Solution disruption via addition of chemical denaturants has been used to partially dissociate native protein complexes into subcomplexes, including successful applications to RNA polymerase and exosomes. 52,53 This technique enables a simple way to access the subunit connectivity without changing the downstream native MS detection method. Because the dissociation occurs in solution, this method may not be easily applicable to highly heterogeneous samples, as released subcomplexes cannot be tracked to their originating precursors. Furthermore, the protocol for partial denaturation requires optimisation for each complex and can fail for proteins that are resistant to mild denaturants or precipitate easily upon denaturation. Other solution-phase methods exist to study higher-order protein structure by subsequent MS analysis, for example chemical crosslinking, 54 protein footprinting methods including fast photochemical oxidation of proteins (FPOP), 16 and hydrogen-deuterium exchange; 55-58 however, these are beyond the scope of this perspective. Integration of information from different native and non-native techniques can provide valuable structural insights. 59,60

2.1.2 Collisional activation induces protein unfolding and subunit release
The term 'complex-up' was coined in 2019; 17 however, early examples of subunit release via activation of protein complexes in the gas phase were reported in 1994 by Smith and co-workers. 61 Under harsh source conditions, several model tetrameric complexes were dissociated into monomers and trimers. This was surprising because the trimers were not known to be physiologically relevant.
This dissociation pattern of monomer stripping appeared to be ubiquitous in several early studies using gas collision to activate the complexes (known as collision-induced dissociation (CID), collisionally activated dissociation (CAD), or, in some instruments, higher-energy collisional dissociation (HCD)), and was also seen for blackbody infrared radiative dissociation (BIRD). 62 Essentially, a monomer was stripped from the complex, leaving behind the stripped (n − 1)-mer (n is the number of subunits in the precursor complex). 61,63,64 The stripped monomers carry away a disproportionate fraction of the total charge relative to their mass. This seemingly odd pattern was described as 'asymmetric' dissociation and was studied in detail by several follow-up reports. 62,65 Accumulating experimental and computational studies have suggested that charge plays an important role in gas-phase protein unfolding and dissociation. 62,65-67 The mechanism of charge migration to the ejected monomer is not fully understood, but mobile charge 62,68,69 and salt bridge rearrangement theories 70,71 have been proposed.
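The 'asymmetric' character of this dissociation can be put into numbers by comparing the fraction of charge and the fraction of mass that an ejected monomer takes from its precursor. The Python sketch below is a minimal illustration, not a published metric: the tetramer mass (56 kDa), the precursor charge (16+), and the monomer charge (8+) are hypothetical values chosen for round arithmetic, and the names `Ion`, `charge_asymmetry`, and `complementary_product` are introduced here for illustration only.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da; ions assumed to be charged by protonation


@dataclass(frozen=True)
class Ion:
    mass_da: float
    charge: int

    @property
    def mz(self) -> float:
        return (self.mass_da + self.charge * PROTON_MASS) / self.charge


def charge_asymmetry(precursor: Ion, product: Ion) -> float:
    """Charge fraction divided by mass fraction taken by the product.

    A value near 1 means charge partitions in proportion to mass
    (symmetric); values well above 1 indicate asymmetric partitioning.
    """
    charge_fraction = product.charge / precursor.charge
    mass_fraction = product.mass_da / precursor.mass_da
    return charge_fraction / mass_fraction


def complementary_product(precursor: Ion, product: Ion) -> Ion:
    """The (n - 1)-mer left behind, by conservation of mass and charge."""
    return Ion(precursor.mass_da - product.mass_da,
               precursor.charge - product.charge)


if __name__ == "__main__":
    tetramer = Ion(56_000.0, 16)   # hypothetical homotetramer
    monomer = Ion(14_000.0, 8)     # hypothetical highly charged monomer
    trimer = complementary_product(tetramer, monomer)

    print("asymmetry ratio:", charge_asymmetry(tetramer, monomer))
    print("precursor m/z:", round(tetramer.mz, 1))
    print("monomer m/z:", round(monomer.mz, 1))
    print("stripped trimer m/z:", round(trimer.mz, 1))
```

With these assumed numbers the monomer holds a quarter of the mass but half of the charge, so the charge-stripped trimer appears at higher m/z than the precursor while the monomer appears at lower m/z.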
CID has been used to release subunits from protein complexes for confirming complex composition. For example, the ubiquitous monomer-stripping pattern was used to activate αB-crystallin complexes with polydisperse stoichiometry (primarily 24-33 mer). 72 As larger oligomers carry more charge in native MS, the signals for all these oligomers end up as an overlapping, unresolvable cluster around m/z 10 000. The released (n − 1)-mers, (n − 2)-mers, and (n − 3)-mers from sequential monomer stripping could, however, be mass resolved and from these, the stoichiometry of the intact complexes was inferred. 72 Although CID can be used to identify the composition of unknown complexes, the dissociation may be incomplete and insufficient to release all subunits of multimeric hetero-complexes. 50,73 Typically, subunits at the periphery are preferentially ejected in CID. 74-76 Because of the significant unfolding and the ubiquitous monomer stripping dissociation pattern, extracting subunit connectivity and architecture from CID data is usually not straightforward.
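The inference described for the polydisperse oligomers can be sketched as follows: each resolved stripped product is assigned an oligomer order from its mass, and the number of monomers lost is added back to obtain the precursor stoichiometry. This is a simplified illustration under stated assumptions and not the procedure of ref. 72: the monomer mass (20 200 Da), the product masses, and the 0.5% mass tolerance are hypothetical, and the function names are introduced here.

```python
from typing import Iterable, List, Optional


def oligomer_order(measured_mass_da: float,
                   monomer_mass_da: float,
                   rel_tol: float = 0.005) -> Optional[int]:
    """Nearest whole number of monomers, or None if the mass does not fit.

    rel_tol is an assumed relative mass tolerance that absorbs residual
    solvent and adducts on the measured product.
    """
    n = round(measured_mass_da / monomer_mass_da)
    if n < 1:
        return None
    expected = n * monomer_mass_da
    if abs(measured_mass_da - expected) > rel_tol * expected:
        return None
    return n


def precursor_orders(stripped_masses_da: Iterable[float],
                     monomer_mass_da: float,
                     monomers_lost: int = 1) -> List[int]:
    """Precursor stoichiometries implied by a set of stripped products."""
    orders = set()
    for mass in stripped_masses_da:
        n = oligomer_order(mass, monomer_mass_da)
        if n is not None:
            orders.add(n + monomers_lost)
    return sorted(orders)


if __name__ == "__main__":
    monomer = 20_200.0  # Da, hypothetical subunit mass
    # Hypothetical (n - 1)-mer masses, each slightly heavy from adducts.
    products = [23 * monomer + 120.0, 27 * monomer + 200.0, 32 * monomer + 90.0]
    print(precursor_orders(products, monomer, monomers_lost=1))
```

The same function can be applied to (n − 2)-mers or (n − 3)-mers by changing `monomers_lost`, which provides a consistency check across the sequential stripping products.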
2.1.3 Surface-induced dissociation reveals subunit connectivity and ligand binding
Surface-induced dissociation (SID), in which proteins collide with a surface target, can produce folded subunits with minimal structural rearrangement for a number of model protein complexes. 77 SID is more efficient than CID in converting kinetic energy of an ion to internal energy because of the larger mass of a surface target compared to neutral gas molecules. Refractory protein complexes (typically those with strong charge-charge interactions including protein-RNA/DNA complexes) are difficult to dissect by CID but can be dissociated by SID. 78,79 In addition, the activation in SID occurs on a much shorter time scale than (low-energy) CID, in which protein ions undergo many lower-energy collisions (Fig. 1). This rapid activation in SID allows protein complexes to be dissected into subcomplexes prior to significant structural rearrangement. 77,80,81 The released subcomplexes therefore provide information on the connectivity of subunits in the precursor. 82,83 For example, the streptavidin tetramer preferentially dissociates into dimers in SID, which is representative of the native 'dimer-of-dimers' structure of the complex. SID has been shown to be helpful for mapping the topology of designed heterocomplexes, 84,85 dissecting the assembly mechanism of transthyretin, 86 and structurally characterising 20S proteasome orthologue complexes from different species. 22 SID was also used to localise noncovalent ligands in multimeric complexes. 50,87,88 However, care must be taken to minimize structural rearrangement before SID. Harsh conditions in the source or transfer optics for achieving the best mass resolution could over-activate the complexes and change their shape significantly, which is measurable by ion mobility. Over-activated complexes will generate different SID spectra from their original structures. 89 Systematic examination of model complexes showed that the SID collision energy required to cleave a given interface is correlated with the interface strength calculated from the (known) structure of the complex. 82 Weaker interfaces will therefore be cleaved at lower collision energy than stronger interfaces in SID. The structurally informative dissociation by SID enables quaternary structure characterisation of unknown proteins that are recalcitrant to classical structural biology techniques. Toyocamycin nitrile hydratase (TNH) and bacterial biomineralisation enzyme Mnx are two heterocomplexes that resist crystallization. Their mass (86 kDa and 210 kDa, respectively) also puts these complexes in a range that is too large to be easily studied by NMR, but too small for cryo-EM. SID experiments of these complexes were quick, with data acquisition typically on the order of minutes to hours, and provided critical information to define their quaternary structures. 50,73,88,91

Fig. 1 Schematic representations of CID and SID of noncovalent protein complexes. A hypothetical potential energy diagram is shown in the inset (reaction coordinate on x axis, potential energy on y axis, arbitrary energy scale). In CID (on the right), protein complexes undergo many steps of collisions, resulting in structural rearrangement, unfolding, and monomer ejection. Rapid activation in SID (left) allows direct dissociation into folded subunits. Adapted with permission from ref. 90. Copyright 2018 American Chemical Society.

Fig. 2a illustrates how the data from complex-up MS were recently used to study the previously uncharacterised heterocomplex Mnx, which consists of three proteins: MnxE (12.2 kDa), MnxF (11.2 kDa), and MnxG (138 kDa). MnxG is homologous to multicopper oxidase, a monomeric enzyme. MnxE and MnxF have no known homologues or functions, but are essential for the stability of the Mnx complex.
Because of the similar mass of MnxE and MnxF, the stoichiometry could not be confidently assigned from size-exclusion chromatography or, due to the peak-broadening effects discussed previously, even from native MS alone. After mass-isolation of the Mnx complex, SID dissected Mnx into MnxE₃F₃ hexamer and MnxG at low collision energy (Fig. 2b), suggesting the complex stoichiometry to be MnxE₃F₃G. With increased collision energy in SID, MnxE₃F₃ further dissociated into subcomplexes following a similar pattern to other symmetric ring complexes (Fig. 2c). The SID data were used to map the subunit connectivity and architecture, allowing a structural model to be built (Fig. 2a). The resolution of the model can be improved to an all-atom level via computational tools, constrained by additional experimental data such as collisional cross section from ion mobility and solvent exposed surface area determined by footprinting techniques as described in Section 2.1.1. In contrast, CID of Mnx showed almost exclusively monomer stripping (MnxE and MnxF), providing limited information for mapping the assembly (Fig. 2d). Notably, MnxE and MnxF monomers released from Mnx by SID showed different Cu-binding stoichiometry (Fig. 2e). MnxE mostly binds a single Cu strongly, while MnxF can weakly bind multiple Cu atoms. The different binding behaviour between MnxE and MnxF revealed by complex-up experiments suggests that the two unknown proteins presumably have different functions. Unlike SID, CID experiments were not able to faithfully capture the metal-binding properties of MnxF, as, during monomer unfolding, the weakly bound Cu in this subunit was lost. 50,88 Recently, SID was also applied to characterise the subunit arrangement of a plant pseudoenzyme-enzyme hetero-complex between PDX1.2 and PDX1.3 in Arabidopsis. 92 The pseudoenzyme PDX1.2 lost its activity due to mutation of a few key residues at the active site but is nearly identical structurally to the active enzyme PDX1.3.
The two proteins form heterododecamers with varying stoichiometry. Both XRD and cryo-EM suffered from statistical disorder and were not able to distinguish the two types of subunits in the hetero-complexes because of their highly similar shapes and the heterogeneity in stoichiometry. 92 However, their different masses can be readily differentiated by MS. SID of the isolated heterododecamers also revealed the symmetry of the subunits within the complex and shed light on the mechanism of the hetero-association. All these examples show that SID is effective for quaternary structure study following the 'complex-up' strategy. Such experiments complement other structural biology techniques, especially for heterogeneous complexes that are difficult to resolve by any single technique. So far, one major factor that has limited the use of SID in practice is the more limited availability of this technique compared to other ion activation methods, although recent work by Wysocki and co-workers has simplified the design and operation of SID. 93 The new design was successfully incorporated into several commonly used instrument models for native MS. The first commercially available SID-enabled instrument was recently announced (SELECT SERIES Cyclic IMS; Waters Corporation, Milford, MA, USA) and others are expected to follow in the near future.
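The ambiguity of intact mass for Mnx, and the way an SID subcomplex resolves it, can be illustrated with a small enumeration in Python. The subunit masses (12.2, 11.2, and 138 kDa) and the approximate intact mass (210 kDa) are the rounded values quoted in the text; the ±3 kDa tolerance, the limit of six copies per subunit, and the function names are assumptions made here for illustration, not parameters from the cited work.

```python
from itertools import product
from typing import Dict, List

# Rounded subunit masses from the text, in Da.
SUBUNIT_MASSES: Dict[str, int] = {"MnxE": 12_200, "MnxF": 11_200, "MnxG": 138_000}


def enumerate_compositions(subunits: Dict[str, int],
                           intact_mass_da: int,
                           tolerance_da: int,
                           max_copies: int = 6) -> List[Dict[str, int]]:
    """All copy-number combinations whose summed mass fits the intact mass."""
    names = list(subunits)
    hits = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        total = sum(n * subunits[name] for n, name in zip(counts, names))
        if total > 0 and abs(total - intact_mass_da) <= tolerance_da:
            hits.append(dict(zip(names, counts)))
    return hits


def contains_subcomplex(candidate: Dict[str, int],
                        subcomplex: Dict[str, int]) -> bool:
    """True if the candidate has enough copies to release the subcomplex."""
    return all(candidate.get(name, 0) >= n for name, n in subcomplex.items())


if __name__ == "__main__":
    candidates = enumerate_compositions(SUBUNIT_MASSES, 210_000, 3_000)
    print("compositions consistent with intact mass alone:")
    for c in candidates:
        print("  ", c)

    # SID released an E3F3 hexamer; keep only candidates that can supply it.
    hexamer = {"MnxE": 3, "MnxF": 3}
    print("after requiring an E3F3 subcomplex:")
    for c in candidates:
        if contains_subcomplex(c, hexamer):
            print("  ", c)
```

Because MnxE and MnxF differ by only 1 kDa, several E/F ratios fit the intact mass within the assumed tolerance, whereas the released hexamer is compatible with a single composition in this search space.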
2.1.4 Other activation methods and important factors for complex-up MS
Other than CID and SID, photo-activation by ultraviolet (UVPD) and infrared multiphoton photodissociation (IRMPD) have also been used in complex-up experiments. While exceptions have been reported, 94 electron-based activation generally does not cause significant disruption of higher-order structure 35 and is thus ineffective for complex-up. Intramolecular energy redistribution after conversion of photon energy to vibrational modes is thought to be responsible for breaking of noncovalent interactions in UVPD. 95 IRMPD of protein complexes, on the other hand, has produced similar results to CID, likely because of the low energy of infrared photons, of which dozens or even hundreds are absorbed to cause dissociation. 96 In contrast, 193 nm UVPD of several model protein complexes showed CID-like asymmetric dissociation at low pulse energy, but changed to more symmetric dissociation (SID-like) at higher pulse energy. 97,98 This change of dissociation behaviour as a function of input energy in UVPD is reminiscent of the mechanistic difference between the slow heating in CID and rapid heating in SID. Even though SID has been shown to induce dissociation of complexes with minimal unfolding, the monomer stripping pathway indicative of unfolding can also be observed in SID, especially for large protein complexes. Previously, a 'shattering' mechanism (i.e., prompt fragmentation/dissociation distinct from slow collisional and thermal activation) was proposed for SID of peptides, 99 but dissociation may occur at a much slower rate after activation for larger molecules with high degrees of freedom and significant intramolecular energy redistribution/relaxation. SID of large protein complexes could also result in multiple, inelastic collisions like those seen for cluster ions. 100 The energy deposition could be affected depending on how the protein ions interact with the surface. 101 In addition to the differences in activation techniques, the charge of protein complexes is another important factor in complex-up experiments. 66,67,98,102 Charge reduction through solution additives and gas-phase reactions can suppress protein unfolding, and generally help SID by increasing the percentage of structurally informative products. 66 In contrast, subunit dissociation is suppressed in CID for charge-reduced precursors. Instead, CID tends to benefit from supercharging, allowing more SID-like dissociation. 67 Computational and theoretical studies have indicated that charge can move upon activation, as further discussed in Section 2.3. The energy landscape of protein complexes in the gas phase is thus strongly affected by charge, resulting in different behaviours with different activation methods. When the goal is to study quaternary structure with a complex-up strategy, conditions should therefore be optimised to minimize unfolding and structural rearrangement.
2.2 Sequencing of ejected proteins from native complexes ('complex-down' approach)
2.2.1 MSn for sequence and stoichiometry of native protein complexes. While mass measurement of monomers and noncovalent (sub)complexes is informative, oftentimes complementary sequence analysis is desired to identify the component proteoforms. There are multiple ways to obtain this information from intact proteins. Top-down MS under denaturing conditions yields important information on sequence and PTMs, but information regarding noncovalent interactions or protein folding is lost. Efficient sequencing can be combined with obtaining information on noncovalent complex stoichiometry through MSn (n = 3 or more) analysis of complexes ionised under non-denaturing conditions. This type of experiment has been referred to as 'complex-down' 17 MS or recently as 'nativeomics'. 103 The extra layer of information afforded by fragmentation of covalent bonds compared to the methods described in Section 2.1 can facilitate a greater understanding of protein complex function.
The rst step of these experiments is to transfer a noncovalent complex from the solution phase to the gas phase without excessive disruption of its higher-order structure. 103 Next, the internal (vibrational) energy of the complex is increased sufficiently to induce ejection of monomers or noncovalent ligands without breaking covalent bonds. Normally, activation is provided by techniques such as CID or SID. CID can be implemented either aer a specic precursor m/z is isolated in the gas phase, or through elevated acceleration voltage of all ions in the source. In the latter case, the technique is oen referred to as in-source dissociation (ISD), which generally requires highly puried samples or online separation so that the products can be traced to the precursor unambiguously. As described in Section 2.1, the charge density of the ejected subunits is usually higher than that of the original complex, making them amenable to sequencing by several commonlyused ion activation methods. The ejected ionmonomer or ligandis subsequently (re-)isolated and fragmented in a (pseudo-)MS 3 or MS 4 workow, providing in-depth information on e.g., phosphorylation sites, 104 metal ion binding, 36 or sequencing of peptide ligands (see Fig. 3). 105 In 2013, Kelleher and co-workers performed complex-down analysis of the GroEL 14-mer (801 kDa). 106 It was found that GroEL monomers could be ejected from the complex in the source and further fragmented with HCD. Sequence-informative fragments could be readily observed in the low-m/z region of the mass spectrum with this workow. More recently, Heck and co-workers obtained sequence information on the Aquifex aeolicus lumazine synthase (AaLS) virus-like nanocontainer (>1 MDa) with UVPD fragmentation, resulting in a mix of monomer ejection and backbone fragmentation. 
107 It was found that optimal collisional cooling and trapping before UVPD fragmentation allowed for efficient detection of intact virus nanocontainers, monomers, and sequence fragments. Combining knowledge of the intact mass of protein complexes from native MS with accurate mass measurement of the ejected, desolvated, highlycharged monomers (MS 2 ), and sequence information (MS 3 ) in this manner, allows for the determination of the exact proteoform composition of the complex.
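The mass bookkeeping behind this last point can be illustrated with a minimal sketch. It assumes that the subunit masses are already known from the MS2 step and that the measured complex mass differs from the sum of its subunits only by a small offset (for example, residual solvent or salt adducts). The function name, the subunit masses, the complex mass, and the tolerance are all hypothetical values chosen for illustration, not data from any of the cited studies.

```python
from itertools import product


def candidate_stoichiometries(complex_mass, subunit_masses, max_copies, tol_da):
    """Enumerate subunit copy numbers whose summed mass lies within tol_da
    of the measured complex mass (hypothetical helper, brute-force search)."""
    names = list(subunit_masses)
    hits = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        if sum(counts) == 0:
            continue  # skip the empty "complex"
        total = sum(n * subunit_masses[name] for n, name in zip(counts, names))
        error = complex_mass - total  # positive if the complex is heavier than predicted
        if abs(error) <= tol_da:
            hits.append((dict(zip(names, counts)), error))
    # Best match (smallest absolute mass error) first
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Hypothetical two-subunit system: masses in Da, invented for illustration.
subunits = {"A": 20000.0, "B": 20500.0}
matches = candidate_stoichiometries(243050.0, subunits, max_copies=12, tol_da=100.0)
for composition, error in matches:
    print(composition, f"mass error = {error:+.1f} Da")
```

With these invented numbers, only a single composition (six copies of each subunit) falls within the tolerance. In practice, poorly resolved native peaks and adducts widen the tolerance, and several compositions can match, which is one reason the sequence-level MS3 data are valuable.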
2.2.2 Sequence elucidation and modication location analysis through complex-down MS. In addition to identifying the primary sequence of subunits, complex-down analysis can elucidate sequence abnormalities such as deletions and mutations in native protein complexes. These data can help characterise the structure and function of protein complexes. Sharon and co-workers found novel sequence information on the alpha subunit of the rat 20S proteasome with complex-down analysis. 22 Notably, they found N-terminal acetylation and removal of the last two amino acids on the C-terminus, complementing data that was absent from cryo-EM, which did not have enough resolution to discern these features. In another example of how complex-down analysis has been shown to aid in the identication of protein complex mutations by providing deep sequence information on ejected monomers, Compton and coworkers demonstrated how complex-down analysis of proteins with regulatory post-translational modications leads to more accurate structural and functional characterisation of those Chemical Science complexes. 36 Sequence analysis of the triosephosphate isomerase complex indicated the presence of a proteoform that was not modied, a proteoform that was phosphorylated at serine 20, and a proteoform that was N-terminally acetylated. The phosphorylated and acetylated proteoforms did not dimerise with themselves or each other, indicating that phosphorylation at serine 20 and N-terminal acetylation act to inhibit the dimerisation of the complex. 36 Sharon and co-workers have used complex-down analysis to determine how yeast cells use phosphorylation to regulate certain cellular pathways under different growth conditions. 104 It was found that the level of phosphorylation of the fructose-1,6-bisphosphatase 1 (FBP1) complex differed depending on whether the cells were grown on carbon-starved media, glucose media, or were heat-shocked. 
Complex-down analysis of the monomers indicated that phosphorylation on Ser12 or Thr13 was highly expressed in cells that were grown on glucose media. Phosphorylation at Ser12 is known to deactivate this complex, indicating that the complex is deactivated under these conditions and the cells readily switch from performing gluconeogenesis to glycolysis. In all these examples, complex-down analysis efficiently located key modifications and/or sequence variants on native protein complexes, illuminating key structural and functional characteristics of those complexes.
2.2.3 Complex-down for characterisation of membrane proteins. Generally, complex-down analysis is performed on soluble protein complexes that are relatively easy to dissolve and spray in native MS buffers. Recently, this technique has been extended to membrane proteins by adding detergents or lipids in the solution to prevent protein precipitation and to preserve their native structure. Carefully tuned collisional activation is used to eject intact membrane protein complexes from detergent micelles so that the membrane protein can be efficiently analysed. 108 Increasing the level of this activation leads to the ejection of protein monomers, allowing complex-down sequencing (see Fig. 4). Recently, Robinson and co-workers applied this workflow to show that MSn analysis was able to identify a lipid molecule that was bound to the outer mitochondrial membrane translocator protein complex. 103 In-source activation ejected the membrane protein from the detergent micelle encapsulating the protein; isolation and activation of the membrane protein complex via HCD ejected the lipid molecule, which was in turn isolated and subjected to HCD or CID, allowing it to be identified. The lipid was found to be a phosphatidylethanolamine 34:1, which fit well with the existing crystal structure. In standard omics experiments, proteins and lipids are extracted separately and characterised by LC-MS; however, using the complex-down strategy, the association between protein and lipid could be directly identified in their native context, providing insight into their biological role.
Brodbelt and co-workers used UVPD for effective sequencing of aquaporin Z monomers ejected from the native tetramer. 109 The sequence coverage for monomers was increased by 21% in this way compared to direct native TD UVPD of the tetramer. This increase was presumably due to disrupted noncovalent interactions between subunits (see Section 2.3 for more on how these interactions affect the native TD fragmentation pattern). In some cases, complex-down analysis has been shown to map important regions of native proteins. Sobott and co-workers recently studied three membrane proteins, i.e., the pentameric mechanosensitive ion channel of large conductance (MscL), the tetrameric Kirbac potassium channel, and the hexameric hepatitis C p7 viroporin. Performing CID-based complex-down MS, they found that b and y fragment ions mainly stemmed from dissociation in the membrane-spanning regions of the monomers. 110 This was consistent with earlier work by Kelleher and co-workers, who performed top-down LC-MS of denatured integral membrane protein monomers, and found that transmembrane domains were more likely to fragment in collisional than electron-based dissociation. 111
2.2.4 High-throughput complex-down of heterogeneous protein samples. Recently, high-throughput native separation techniques including GELFrEE, 36 HIC, 39,40 SEC, 112 capillary zone electrophoresis (CZE), 112 and IEX 38 have efficiently aided in the analysis of heterogeneous protein mixtures. Kelleher and coworkers demonstrated the use of native GELFrEE coupled to MS for the characterisation of protein complexes from cobra venom. 113 Complex-down analysis enabled the characterisation of protein complexes, and in some cases allowed identification of glycosylated and metal-bound proteoforms. In a different example, Sun and co-workers showed that SEC and CZE can separate multiple proteins and protein complexes before analysis with complex-down mass spectrometry on a lysate of E. coli, resulting in the identification of 672 proteoforms and 23 protein complexes. 112 These examples illustrate how, by preserving the native structure of complexes, these high-throughput techniques can yield critical information regarding complex formation and ligand binding that is not easily accessible by conventional proteomics techniques. New developments in MS instrumentation, 113 spectrum deconvolution software, 112 and fast peak identification software 113 are expected to make high-throughput complex-down studies more routine in the future. Data analysis has so far been a particular bottleneck in these approaches, as native MS spectra often do not have isotopic resolution, and so common software for small molecule and peptide analysis cannot be used to convert m/z to intact mass. Spectral interpretation of native MS has largely relied on manual analysis, although deconvolution software such as UniDec and iFAMS has been developed in recent years. 60,114 Likewise, analysis of top-down MS data has largely involved manual analysis or manual validation due to the complexity of the data. Data interpretation for complex-down spectra is similar to TDMS of denatured proteins; therefore, many existing software packages for TDMS can be retrofitted for complex-down data, including ProSight PTM 2.0, 115 ProSight Lite, 116 MASH Suite, 117-119 ProteinGoggle, 120 TopPIC, 121 Informed-Proteomics, 122 Masstodon, 123 and others. Even then, some challenges unique to complex-down remain, including the need for multi-stage activation, poorly resolved precursor complexes, and low-intensity peaks, and so manual analysis or validation still plays an important role in practice. An up-to-date overview of software packages for analysis of top-down data can be found on the webpages of the Consortium for Top-Down Proteomics (https://www.topdownproteomics.org) and an excellent review was recently published. 12
2.3 Conformation-sensitive fragmentation of native complexes for secondary and tertiary structure
2.3.1 Electron- and photon-based activation can directly probe gas-phase higher-order structure. The methods discussed in Section 2.2 rely on gas-phase fragmentation of the protein backbone, but only after the higher-order structure of a protein complex has been largely annihilated. The 'native top-down' strategy also refers to fragmentation of the protein backbone, but fragmentation methods are directly applied to native proteins or complexes instead of unfolded proteins. Native TD generally results in less extensive fragmentation than denatured TD because (1) the lower charge states of native species in ESI have a negative impact on the efficiency of most fragmentation methods, and (2) the noncovalent interactions (salt bridges, hydrogen bonds, etc.) that characterise higher-order structure can protect parts of the protein from fragmentation and/or prevent the release of fragments. Due to the second factor, the lack of fragments in certain regions of the protein can inform on higher-order protein structure. 35 Electron-based dissociation (ExD) is useful for probing protein structure because it generates backbone fragments without annihilating the higher-order structure of the protein.
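Both the charge states mentioned in point (1) and the conversion of m/z to intact mass discussed in Section 2.2.4 rest on the same charge-state arithmetic. As a minimal sketch, assuming positive-mode ions charged only by protons and two correctly assigned neighbouring peaks of the same species, the charge and neutral mass can be recovered as follows. The function names and the 150 kDa test species are hypothetical; this is not how UniDec or iFAMS work internally, only the textbook relationship those tools build upon.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_of(mass, charge):
    """m/z of an ion of neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_low is the peak with charge z + 1, mz_high the peak with charge z.
    Since mz_high - p = M / z and mz_low - p = M / (z + 1), it follows that
    z = (mz_low - p) / (mz_high - mz_low).
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be larger than mz_low")
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


# Hypothetical 150 kDa species observed at charge states 26+ and 25+.
low, high = mz_of(150000.0, 26), mz_of(150000.0, 25)
charge, mass = mass_from_adjacent_peaks(low, high)
print(f"charge of higher-m/z peak: {charge}+, neutral mass: {mass:.1f} Da")
```

Real native spectra rarely allow this to be done so cleanly, because broad, adducted peaks shift the apparent m/z, which is part of the reason dedicated deconvolution software is needed.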
In ECD, protein ions capture low-energy (1-3 eV) electrons, leading to the formation of c/z fragment ions. 124 Electron ionisation dissociation (EID) is an electron-based fragmentation method that uses higher-energy electrons to fragment proteins, generating a/x and b/y ions in addition to c/z ions. 125 EID offers more extensive fragmentation than ECD and, as electron energy can be tuned through user-accessible instrument parameters, both experiments can be performed using the same instrumentation. In addition, it was recently reported that EID of proteins results in more internal fragments than ECD, which could help with protein sequence coverage. 126 ETD is mechanistically similar to ECD, but uses a radical anion rather than a cathode to provide electrons. 127 As in ECD, the transfer of the low-energy electron typically results in c/z fragment ions.
This journal is © The Royal Society of Chemistry 2020. Chem. Sci., 2020, 11, 12918-12936

Perspective
Native TD can be performed using UVPD, and light sources (usually lasers) with different wavelengths have been reported for protein characterisation. Most of the reported native TD work has used 193 nm UVPD (ArF excimer laser), which yields a/b/c/x/y/z ions from the mixed mode of both vibrational and electronic excitation. 95 In comparison, UVPD with 157 nm photons (F2 excimer laser) appears to be more effective at generating a/x ions and has specificity towards disulphide bond cleavage. 128,129 UVPD at wavelengths of 213 nm 130 and 266 nm 131 has also been reported for native TD of small proteins. In the following paragraphs, we will consider how different aspects of higher-order structure can be probed using native TD fragmentation.
2.3.2 Native TDMS for secondary and tertiary structure. Fourier transform ion cyclotron resonance instruments can readily be used to perform ECD because low-energy electrons can be efficiently trapped by the static electromagnetic field in the ICR cell. Early experiments used ECD to probe structural changes in ubiquitin 29 and cytochrome c 132 after being transferred into the gas phase. It was found that these proteins show more fragmentation with an increase in source temperature and pre-ECD infrared activation. This indicated that these proteins start to unfold in the gas phase when their internal energy is increased by these 'slow' activation methods. These experiments on small monomeric proteins clearly demonstrated the feasibility of accessing secondary and tertiary structures using MS techniques and spurred research on larger proteins and protein complexes.
ExD techniques have been used to probe the secondary and tertiary structure of larger proteins and protein complexes in the gas phase. It has been reported that even though ExD techniques may fragment the backbone of the protein, interactions such as salt bridges 30 and disulphide bonds 133 can hold fragments together, preventing their release and detection in MS. This phenomenon is known as 'electron capture/transfer with no dissociation' (ECnoD or ETnoD, respectively). This principle can be used to probe secondary and tertiary structure. Barran and co-workers have shown that the stability of certain secondary structure elements of proteins can be probed by fragmenting different charge states. 134 It has also been found that the location of disulphide bonds can be mapped with ExD fragmentation. 133,135 ExD techniques have also been shown to probe tertiary structure on protein complexes such as alcohol dehydrogenase (ADH). ECD 136 and ETD 137 of native tetrameric yeast ADH both yielded primarily N-terminal fragments, while virtually no fragmentation in the C-terminal region was observed. For the C-terminus, this behaviour can be easily rationalised, as it is buried in the interior of the complex. To rationalise relative fragment abundances from the N-terminal region, explanations have been proposed based on local backbone flexibility and surface exposure (see Fig. 5). 136-138 ExD fragmentation has also been utilised to probe the structure of other protein complexes such as haemoglobin 139 and glutamate dehydrogenase. 35 These examples clearly show that ExD can provide useful structural information about proteins and complexes.
Similar to electron-based activation, 193 nm UVPD has been shown to cleave preferentially at flexible and exposed regions of proteins and protein complexes. However, 193 nm UVPD generally shows higher sequence coverage than ExD and can yield fragments even in more protected regions of proteins. 125,140,141 The vibrational excitation component of 193 nm UVPD likely improves the release of fragments by disrupting noncovalent interactions more efficiently than ExD. Nonetheless, high-intensity fragments are generally correlated with surface accessibility in UVPD, and shifts in fragment intensities indicate conformational changes. 141-143 By quantifying the changes in fragmentation efficiency, subtle structural effects of metal-/ligand-binding can be probed with 193 nm UVPD. 144,145 A unique feature of 193 nm UVPD is that the a-type fragment ions are sensitive to protein secondary structure. Previous studies on proline-containing peptides showed that a + 1 and a + 2 ions (i.e., a-type ions carrying one or two additional hydrogen masses, respectively, relative to their canonical structure) can be detected (see Scheme 1). Upon homolytic cleavage in UVPD, odd-electron a + 1 ions are produced. These are thermodynamically unstable and can eliminate a hydrogen atom to form the commonly detected a ions. Amino acid structures and secondary structures both affect the lifetime and the detectability of a + 1 ions. 146-148 The rigid backbone of proline was believed to be responsible for the a + 2 ions, which are formed as the alpha-carbon abstracts a hydrogen atom from a nearby residue. 148-150 These behaviours have recently been shown to translate from the early peptide studies to model protein complexes. 141,146 The increased relative abundance of a + 1 to a ions was correlated with hydrogen bonding motifs in small monomeric proteins. 146 Turn structure in proteins was also suggested to be responsible for formation of a + 2 ions in the absence of proline. 141 Therefore, it is possible to extract such spectral features from UVPD experiments to obtain secondary structural details of protein complexes in the gas phase.
2.3.3 Ligand binding sites and subunit binding interfaces through native TD. Both ExD and UVPD can identify ligand binding sites in native TD through monitoring backbone fragments that retain the noncovalent ligand. Since these ligands are generally weakly bound to proteins and protein complexes, activation through CID tends to disrupt the ligand binding, causing information about the binding site to be lost. In contrast, ExD and UVPD can dissociate the peptide backbone while preserving the ligand. In the fragmentation spectrum, the larger apo fragments (i.e., long sections of the protein sequence without ligand) and the smaller holo fragments (short stretches with the ligand) can together define the site (or region) of the bound ligand. 136,140,151 Using this analysis, it is possible to localise binding sites on proteins and complexes, providing insight into the function of ligands.
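The bracketing logic described above can be sketched in a few lines. The sketch assumes a single binding site, no ligand migration or loss before fragment detection, and correctly assigned N-terminal (c-type) and C-terminal (z-type) fragments; the function name and all fragment indices are hypothetical and serve only to illustrate the reasoning. A c fragment of size k covers residues 1 to k, and a z fragment of size j covers the last j residues.

```python
def localise_ligand(seq_len, c_apo, c_holo, z_apo, z_holo):
    """Bracket a single ligand binding site from apo/holo fragment sizes.

    c_apo / c_holo: sizes of N-terminal fragments without / with ligand.
    z_apo / z_holo: sizes of C-terminal fragments without / with ligand.
    Returns (first, last) residue numbers (1-based, inclusive) of the
    region that must contain the site under the stated assumptions.
    """
    first, last = 1, seq_len
    if c_apo:   # largest apo c fragment: site lies beyond its last residue
        first = max(first, max(c_apo) + 1)
    if c_holo:  # smallest holo c fragment: site lies within it
        last = min(last, min(c_holo))
    if z_apo:   # largest apo z fragment: site lies before its first residue
        last = min(last, seq_len - max(z_apo))
    if z_holo:  # smallest holo z fragment: site lies within it
        first = max(first, seq_len - min(z_holo) + 1)
    if first > last:
        raise ValueError("fragments are inconsistent with a single fixed site")
    return first, last


# Hypothetical 100-residue protein with invented fragment assignments.
region = localise_ligand(
    seq_len=100,
    c_apo=[10, 25, 40], c_holo=[62, 80],
    z_apo=[15, 30], z_holo=[55, 70],
)
print(region)  # (46, 62)
```

An inconsistent set of fragments, which raises the error here, would in a real experiment hint at multiple binding sites, ligand migration, or misassigned fragments rather than at a software problem.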
Loo and co-workers demonstrated for the first time in 2006 that ECD could be used to localise noncovalently bound spermine on the amyloidogenic protein α-synuclein. 151 Soon after that, Sadler and co-workers showed that binding sites of the drug cisplatin can be localised on peptides with ETD fragmentation. 152 Since then, ECD has been used to pinpoint NAD+ on alcohol dehydrogenase 153 and small aggregation-inhibiting compounds on amyloid proteins. 154,155 Correlation of these binding sites with structural changes can suggest possible mechanisms of noncovalent ligand binding. 155 Several of these examples also illustrate the capability of native MS-based methods to probe intrinsically disordered proteins and their complexes. More recently, UVPD has been used to localise noncovalent ligands on protein complexes, such as haem on myoglobin, 143 NADPH and methotrexate on dihydrofolate reductase, 142 and GTP on eIF4E. 140 These examples clearly show that ExD and UVPD preserve noncovalent ligands when cleaving the peptide backbone and can readily localise these ligands on proteins and protein complexes.
Fragmenting protein/metal ion complexes can aid in the localisation of metal ions. Early fragmentation of peptide/metal complexes used CID to dissociate the peptide backbone. 156,157 In order to preserve the metal/peptide complex, low-energy CID was used in these studies; however, it remains a concern that the increase in internal energy could mobilise the metal cation sufficiently to migrate across the protein/peptide in the gas phase prior to backbone fragmentation. In some cases, weakly bound metal ions might be lost in CID, as discussed in Section 2.1.3. Since the development of ExD and photodissociation techniques, metal ions can be more reliably pinpointed on peptides and proteins. Metal ion binding sites have been identified on native peptide and protein monomers such as amyloid β, 158 α-synuclein, 159,160 and carbonic anhydrase 161 with ECD, as well as native protein complexes such as alcohol dehydrogenase. 153,162 Similarly, UVPD is also effective in determining metal binding sites as shown in model metalloproteins including staphylococcal nuclease, azurin, and calmodulin. 145 Brodbelt and co-workers have also used this method to localise zinc ions within the insulin pentamer. 140 Interestingly, in some of these studies it was shown that some CID fragments also retained the metal cation, and the pattern of apo and holo CID fragments was consistent with that from ECD; this was likely due to the 80-fold lower dielectric permittivity of vacuum compared to water strengthening the electrostatic protein-metal interactions. For the same reason, metal binding can sometimes survive monomer ejection, and a complex-down approach (see Section 2.2) has also been successfully applied to identify binding sites of endogenous metal cofactors of both soluble 36 and membrane 163 proteins.
Taking this a step further, native TD can provide information on binding interfaces between biological macromolecules. In this way, native TDMS has proven useful for understanding the quaternary structure of protein complexes as well as 'lower' levels of higher-order structure as discussed in Section 2.3.2. In 2009, Woods and co-workers used both ECD and ETD to investigate the residues involved in binding between small (<10 residues) acidic and basic peptides. 164 Other pioneering work was performed by Langridge-Smith and co-workers in 2011, who used ECD to study the binding interface between the anterior gradient-2 protein and its hexapeptide ligand PTTIYY, concluding that binding involves the C-terminal part of the protein. 165 Schneeberger and Breuker have used CID to determine the binding site of RNA on proteins, again taking advantage of the strengthening in the gas phase of electrostatic interactions to the point where they are occasionally able to survive backbone fragmentation. 166 Recently, O'Connor and coworkers have shown that ECD fragmentation of oligomers of the amyloidogenic peptide amylin (implicated in type 2 diabetes) can provide information on the binding interface between monomers (see Fig. 6). 167 Specifically, the observation of product ions consisting of an intact monomer noncovalently bound to either the three C-terminal residues (z3 fragment), or the 29 N-terminal residues (c29 fragment), led the authors to propose a model in which dimerisation occurs between Ser29 of the first, and Asn35 of the second monomer, in a staggered fashion.
2.3.4 Native TD to study similarity and differences between solution-phase and gas-phase structures. As discussed in Section 1.2, there is an ongoing debate on how closely gas-phase structure reflects that in solution. Native TD has offered critical experimental data for understanding the evolving protein structures in the gas phase. 141,168 Although some side chain rearrangements are expected from removal of solvent (for example, positively charged lysine side chains will rapidly form interactions with backbone carbonyl groups, as this intramolecular solvation is energetically highly favourable), the overall fold of protein complexes is generally believed to be kinetically trapped on the time scale of the MS analysis. 5 However, inadvertent excessive gas-phase pre-activation of protein complexes (beyond what is needed for efficient desolvation) can lead to significant structural rearrangements, which can be manifested by changes in native TD spectra. Therefore, it is necessary to optimise MS tuning to reduce the amount of activation applied to a protein or protein complex. With little activation it seems that overall protein structure is preserved in the gas phase and can be readily probed for structural characteristics.
Gross and co-workers demonstrated that native TD ECD fragments from the ADH tetramer reached deeper into the complex (starting from N-terminus into the core) as it unfolded with increasing activation in the ion source. 136 The experimental data suggested that ADH unfolds through a 'peeling an onion' mechanism in which the N-terminus gradually unravels. Conceptually similar experiments reporting ECD and ETD of partially-unfolded haemoglobin were carried out in 2015 by the Gross, 169 Loo, 170 and Sobott 139 groups, and recently ECD and ion mobility were performed on the same instrument, providing two orthogonal methods to probe unfolding of this tetramer. 171 Recently, the unfolding of ADH was re-examined by native TD using both ECD and 193 nm UVPD on the same instrument, further indicating that the N-terminus unravels with increasing collision energy (see Fig. 7). 141 Larger fragments reaching deeper into the core of the protein were detected for both ECD and 193 nm UVPD as higher in-source collision energy was used. Interestingly, subtle changes in spectral features indicated that unfolding was not a simple, gradual unravelling of N-terminal residues. The ECD fragments within the first 50 residues decreased in intensity with increasing collision energy, implying protection from fragmentation in the region. UVPD data showed changes in a/a + 1 ion ratio in the first 50 residues as well, suggesting secondary structural changes. In addition, charge movement was monitored by examining the charge sites based on charge states of UVPD a ions, as pioneered by Morrison and Brodbelt. 172 Charge density first increased at the N-terminus, but then surprisingly decreased as the collision energy further increased. By combining all the available data, it was proposed that ADH underwent N-terminal unfolding followed by (partial) refolding.
The examples discussed here show the tremendous experimental detail that native TD offers for improving our understanding of gas-phase protein structure. Previous computational and theoretical studies have largely relied on complex-up experiments at the intact protein/subunit level, which do not provide information at the amino acid level. We anticipate that the integration of native TD and computational modelling will greatly enhance our understanding of critical factors that modulate gas-phase protein fragmentation and dissociation. The ability to perform gas-phase spectroscopy on intact proteins and complexes in MS is particularly exciting and allows a new level of structural information to be accessed. Action ion spectroscopy coupled to MS using free electron lasers with variable wavelengths is well-suited for structure analysis as demonstrated by work on small molecules, peptides, and oligonucleotides. This concept is based on the varying fragmentation efficiency of precursor ions in response to enhanced absorption at resonant photon energies. Infrared ion spectroscopy has already been demonstrated on small monomeric proteins. 32,173 Circular dichroism was recently combined with MS for oligonucleotides. 174 With ongoing development of advanced light sources coupled to native MS, 44,175 we expect the possibility of performing spectroscopy analysis of native proteins and complexes for deep structural characterisation to emerge in the near future. The ability to mass-isolate species by MS-based methods could potentially transform structural biology research by offering complementary methods to study non-homogenous samples (e.g., endogenous proteins isolated directly from biological matrices).

Conclusion and outlook
The biology of proteins must be considered through the lens of their ability to interact with one another, other biomolecules, and various cofactors and substrates. For this reason, complete knowledge of the primary structure, including post-translational modifications, is a necessary but insufficient condition to understand protein function. Recent methodological developments have enabled direct probing of the higher-order structure of native proteins and complexes, including membrane proteins, by mass spectrometry. At the lowest level of information, the mass and overall size of a complex can be measured using ion mobility-mass spectrometry; however, by combining native ionisation with top-down fragmentation, a range of workflows become accessible. These allow probing of subunit connectivity (complex-up), efficient ligand identification and monomer sequencing (complex-down), and elucidation of secondary and tertiary structure (native TD). The most commonly used ion activation methods in these experiments are summarised in Fig. 8, along with the structural levels they are typically used to probe. These recent developments in MS technology are complemented by advances in native separation strategies. These will become essential, as the future of this field will include analysis of native proteins from complex systems such as cells and tissues. We expect both separation and MS methods to see further development in the near future to address even larger assemblies and heterogeneous mixtures. Ongoing development of TDMS software will soon enable identification of all peaks present in native MS, complex-down, and native TDMS spectra. Furthermore, the top-down MS field has historically been to some extent dominated by a relatively small group of laboratories, based on the sophisticated equipment (in many cases custom modified mass spectrometers) and expertise required.
However, the multinational Consortium for Top-Down Proteomics has launched an initiative aimed at making these workflows accessible to more labs. As instruments, methods, and knowledge become more accessible, we expect to see these methods being adopted by significantly more researchers in the next few years.

Conflicts of interest
There are no conicts to declare.