Introduction

Fluorescence microscopy is a powerful tool in biological research due to its live-cell compatibility and labeling specificity, allowing researchers to examine the molecular distribution and organelle morphology and dynamics in a diverse range of biological model organisms. However, conventional far-field fluorescence microscopy is constrained by the diffraction limit of light with a spatial resolution of 200 nm laterally and 500 nm axially, making it challenging to resolve cellular and tissue constituents at the nanoscale. Single-molecule localization microscopy (SMLM), typically including fluorescence photoactivation localization microscopy (PALM/fPALM) [1, 2], stochastic optical reconstruction microscopy (STORM) [3], direct STORM (dSTORM) [4, 5], DNA points accumulation for imaging in nanoscale topography (DNA-PAINT) [6, 7], and minimal photon fluxes (MINFLUX) [8, 9], has overcome this barrier by randomly activating a small subset of molecules to fluoresce at any given time, enabling the precise localization of their isolated emission patterns and the super-resolution reconstruction through the accumulation of locations of thousands to millions of single molecules (Fig. 1A) [10,11,12]. This groundbreaking approach can enhance resolution by a factor of ten, revealing the complex details of molecular activities and structures in cells and tissues. It also has the potential to advance research in disease mechanisms, diagnostics, and treatments, such as cancers [13,14,15,16], Alzheimer’s disease [17,18,19], and Parkinson’s disease [20,21,22].
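As a rough illustration of why localizing isolated emitters can beat the diffraction limit, the sketch below uses the shot-noise-limited approximation in which the localization precision equals the PSF standard deviation divided by the square root of the number of detected photons. The function name, the 250 nm PSF width, and the photon counts are illustrative assumptions rather than values from the cited works. Pixelation and background, which degrade precision in practice, are ignored.

```python
import math


def localization_precision_nm(psf_fwhm_nm: float, n_photons: int) -> float:
    """Shot-noise-limited localization precision (standard deviation, nm).

    Assumes a Gaussian PSF and ignores pixelation and background noise.
    """
    if n_photons <= 0:
        raise ValueError("n_photons must be positive")
    # Convert the full width at half maximum to a Gaussian standard deviation.
    psf_sigma_nm = psf_fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    # Precision improves with the square root of the photon count.
    return psf_sigma_nm / math.sqrt(n_photons)


# Illustrative values only: a 250 nm wide PSF and three photon budgets.
for photons in (100, 1000, 5000):
    print(photons, round(localization_precision_nm(250.0, photons), 2))
```

Under these assumptions, quadrupling the photon count halves the localization uncertainty, which is why bright, well-isolated emitters are central to every SMLM variant discussed below.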

As an extension of SMLM, multicolor SMLM images distinct protein species labeled with different fluorescent probes within the same specimen, aiming not only to elucidate the structural intricacies of individual targets but also to uncover the spatial relationships and interactions among them [23,24,25]. While this technique has garnered significant attention for exploring organelle interactions, biologists may find its numerous variants difficult to navigate. This review focuses exclusively on recent advances in multicolor SMLM, analyzing the strengths and limitations of each variant and discussing future prospects.

Fig. 1

Concept of multicolor SMLM. A Schematic of SMLM. B Simultaneous multicolor SMLM including ratiometric SMLM, spectroscopic SMLM, point spread function (PSF) engineering SMLM, and excitation modulation SMLM. The green and red dashed boxes show the excitation and emission paths, respectively. The dark boxes show typical emission patterns of single molecules, partly reproduced with permission from Ref. 97, copyright 2016 Springer Nature. SLM: spatial light modulator

Approaches in multicolor SMLM

Based on the methodology of image acquisition, current multicolor SMLM approaches can be categorized into two strategies: sequential acquisition and simultaneous acquisition. Sequential acquisition involves capturing one-color data at a time through sequential imaging of each structure, while simultaneous acquisition involves capturing multicolor data simultaneously and distinguishing them through post-processing algorithms (Fig. 1B).

Sequential multicolor SMLM

Current sequential multicolor SMLM techniques mainly include sequential multicolor SMLM with multiple fluorophores, sequential multicolor SMLM with a single fluorophore, and exchange-PAINT.

Sequential multicolor SMLM with multiple fluorophores

The most straightforward approach to sequential multicolor SMLM is to label different protein species with dyes that have well-separated excitation spectra and to image them one after another with different excitation lasers (Fig. 2A). This method was initially proposed for SMLM by Bock et al. [26] in 2007, who excited the fluorescent protein rsFastLime and the organic fluorophore Cy5 at 488 nm and 633 nm, respectively, for two-color imaging of the microtubular network in fixed PtK2 cells (Fig. 2B). The same year, Shroff et al. [27] performed two-color super-resolution imaging of various pairs of proteins assembled in adhesion complexes and discovered that proteins appearing colocalized in conventional microscopy were in fact resolved as distinct interlocking nano-aggregates in super-resolution microscopy. Subsequently, the technique gained widespread adoption and facilitated significant biological discoveries by unveiling the spatial organization of multiple biomolecules, such as the eight-fold symmetry of the gp210 proteins in the nuclear pore complex (NPC) (Fig. 2C) [28], the kinetics of target recognition mediated by in vivo base pairing (Fig. 2D) [29], and the type III secretion machine in real time within S. Typhimurium bacteria (Fig. 2E) [30]. This method is relatively easy to implement and exhibits minimal crosstalk between channels. However, well-separated excitation spectra imply well-separated emission spectra, leading to substantial chromatic aberrations that necessitate precise registration between channels to maintain nanoscale resolution [31]. Furthermore, fluorophores outside the far-red channel may perform suboptimally in terms of duty cycle (the fraction of time a fluorophore spends in the fluorescent state over a complete switching cycle comprising both the fluorescent and dark states) and photons emitted per switching event under specific imaging buffer conditions, potentially compromising localization precision or density. To mitigate these challenges, an alternative strategy of sequential multicolor SMLM using fewer or even a single fluorophore (Fig. 3A) has been proposed.
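The duty cycle defined above can be estimated from a single-molecule blinking trace as the total time spent in the fluorescent state divided by the total observation time. The following is a minimal sketch of that ratio. The function name and the dwell times are hypothetical and serve only to show the arithmetic.

```python
def duty_cycle(on_times_s, off_times_s):
    """Fraction of time spent in the fluorescent state.

    on_times_s / off_times_s: dwell times (seconds) of the fluorescent
    and dark states measured over one or more switching cycles.
    """
    total_on = sum(on_times_s)
    total = total_on + sum(off_times_s)
    if total <= 0:
        raise ValueError("total observation time must be positive")
    return total_on / total


# Hypothetical dwell times: short bright bursts separated by long dark periods.
on_times = [0.02, 0.03, 0.05]
off_times = [30.0, 40.0, 29.9]
print(f"duty cycle = {duty_cycle(on_times, off_times):.4f}")
```

A low duty cycle keeps the fraction of simultaneously emitting molecules small, which is what allows dense structures to be imaged without overlapping emission patterns.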

Fig. 2

Sequential multicolor SMLM with multiple fluorophores. A Schematic of sequential multicolor SMLM with multiple fluorophores. B Image of microtubular network in PtK2 cells labeled with rsFastLime (green) and Cy5 (red). Reproduced with permission from Ref. 26. Copyright 2007 Springer Berlin Heidelberg. C Image of NPCs using ATTO520 (green) and AF647 (magenta) labeled secondary antibodies directed against an epitope of gp210 on the luminal side, with the eight-fold symmetrical ring structure of gp210 surrounding the NPC and N-acetyl glucosamine-containing nucleoporins. Reproduced with permission from Ref. 28. Copyright 2012 Company of Biologists Ltd. D Image of SgrS (red) and ptsG mRNA (green) labeled by smFISH, showing the kinetic properties of SgrS regulation of ptsG mRNA. Reproduced with permission from Ref. 29. Copyright 2015 American Association for the Advancement of Science. E Image of fixed S. Typhimurium expressing mEos3.2-SpaO (green) stained with an AF647-labeled antibody directed to an epitope tag present in the type III secretion needle tip protein SipD (magenta). Reproduced with permission from Ref. 30. Copyright 2017 the National Academy of Sciences of the United States of America

Sequential multicolor SMLM with a single fluorophore

Multicolor STORM utilizing photo-switchable fluorescent probes was proposed by Bates et al. [32] in 2007. Each probe consists of a ‘reporter’ that can be toggled between fluorescent and dark states and an ‘activator’ that triggers the activation of the reporter (Fig. 3B). By linking the ‘reporter’ Cy5 to three different ‘activators’ (AF405, Cy3, and Cy2), distinct activation spectra were created, enabling sequential color-specific activation for three-color super-resolution imaging. Subsequently, in 2010 Dani et al. [33] achieved three-color three-dimensional (3D) imaging using the same ‘activators’ but a different ‘reporter’, AF647, allowing the visualization of presynaptic Bassoon and postsynaptic Homer1 in the main olfactory bulb of the mouse (Fig. 3C). In 2012, Bates et al. [34] combined the same ‘activators’ with two ‘reporters’, AF647 and AF750, thereby creating six fluorescent probe pairs for six-color imaging. This method not only reduces the number of fluorophores required for multicolor SMLM but also enables the flexible selection of infrared dyes with superior blinking capabilities, facilitating the acquisition of high-quality data in each activated channel. However, it suffers from high crosstalk of 10–20% due to non-specific activation by the laser.
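The combinatorial logic of activator-reporter pairing can be made explicit with a short sketch: three activators combined with two reporters give the six probe pairs mentioned above. The function name is hypothetical, and the "activator-reporter" naming follows the convention used in the caption of Fig. 3.

```python
from itertools import product

# Dyes named in the text for the six-color experiment.
ACTIVATORS = ("AF405", "Cy2", "Cy3")
REPORTERS = ("AF647", "AF750")


def probe_pairs(activators, reporters):
    """Enumerate every activator-reporter combination as 'activator-reporter'."""
    return [f"{a}-{r}" for a, r in product(activators, reporters)]


pairs = probe_pairs(ACTIVATORS, REPORTERS)
print(len(pairs), pairs)
```

Each pair is addressed by the activation wavelength of its activator and the emission channel of its reporter, so the number of distinguishable colors grows as the product of the two sets.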

Fig. 3

Sequential multicolor SMLM with a single fluorophore. A Schematic of sequential multicolor SMLM with a single fluorophore. B Schematic of sequential multicolor SMLM based on photo-switchable fluorescent probes constructed from activator-reporter pairs. C Image of presynaptic Bassoon (red) and postsynaptic Homer1 (green) labeled with Cy3-AF647 and AF405-AF647, respectively, in the glomerular layer of the main olfactory bulb in the mouse (top), side-view image (bottom left) with the trans-synaptic axis rotated into the viewing plane, and face-view image (bottom right) with transsynaptic axis rotated perpendicular to the viewing plane. Reproduced with permission from Ref. 33. Copyright 2010 Cell Press. D Schematic of sequential multicolor SMLM based on fluorescence quenching, including labeling, imaging, photo-destruction, relabeling, etc. E Image of mitochondrial outer membrane protein TOM20 (yellow), mitochondrial inner membrane protein ATP-synthase (cyan), lysosomal protein Lamp2 (red), total tubulin (green), and acetylated tubulin (magenta) in BS-C-1 cells, all labeled with AF647, showing that the acetylated tubulin colocalizes with total tubulin and ATP-synthase colocalizes with TOM20 while Lamp2 does not colocalize with either total tubulin or TOM20. Reproduced with permission from Ref. 35. Copyright 2014 Public Library of Science. F Image of clathrin (yellow), α-tubulin (green), actin (orange), and EGFR (blue) in HeLa cells, all labeled with AF647. Reproduced with permission from Ref. 36. Copyright 2015 Public Library of Science

To lower the crosstalk and simplify the imaging methodology, multicolor SMLM through sequential labeling was developed by Tam et al. [35] in 2014, utilizing a single fluorophore, AF647, to label a variety of primary antibodies that target proteins of interest. This approach involves labeling the first target, acquiring single-molecule data, quenching the remaining fluorescence with NaBH4, labeling the second target, and so forth (Fig. 3D). A key challenge of this technique is the precise relocation of the same region of interest after each round of labeling. To overcome this obstacle, the authors first implemented a ‘virtual grid’ for coarse alignment, recording the coordinates of the imaged region and two reference points to calculate the rotation angle between imaging sessions, and then compared images containing fluorescent beads from the two imaging sessions for fine alignment. The authors achieved five-color imaging in BS-C-1 cells, targeting the mitochondrial outer membrane protein TOM20, the mitochondrial inner membrane protein ATP-synthase, the lysosomal protein Lamp2, total tubulin, and acetylated tubulin (Fig. 3E). However, labeling with primary and secondary antibodies can increase the linkage error. To reduce this error, in 2015 Valley et al. [36] labeled specimens with AF647 using directly conjugated primary antibodies and used the bright-field image as a reference for pre-imaging alignment across imaging sessions. In this study, the authors found that only 85% of AF647 was permanently quenched by NaBH4 and therefore combined photobleaching with NaBH4 quenching, reducing crosstalk to less than 0.5%. This technique enabled the imaging of clathrin, α-tubulin, actin, and epidermal growth factor receptor (EGFR) in HeLa cells (Fig. 3F). This sequential labeling method eliminates chromatic aberrations, avoids the incompatibility of different fluorophores in the same imaging buffer, reduces instrumentation costs by requiring only one excitation laser and one activation laser, and lowers crosstalk through fluorescence quenching with NaBH4. However, the sample must be repeatedly removed for quenching and labeling and then returned to its original position, which relies heavily on the expertise of the experimentalist and can be time-consuming and labor-intensive. Automated devices incorporating motorized and piezo stages along with microfluidics have been implemented to simplify buffer exchange, relabeling, and washing steps, notably streamlining sample preparation and imaging procedures [37, 38].
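The coarse alignment step described above, in which a rotation angle is calculated from two reference points recorded in each imaging session, can be sketched as a generic rigid (rotation plus translation) registration. This is not the authors' implementation. The function names and coordinates are hypothetical, and the sketch assumes the sample is neither stretched nor flipped between sessions.

```python
import math


def rigid_transform_from_two_points(p1_a, p2_a, p1_b, p2_b):
    """Return (theta, tx, ty) mapping session-A coordinates onto session B.

    The rotation is taken from the direction of the line joining the two
    reference points; the translation is fixed by the first reference point.
    """
    angle_a = math.atan2(p2_a[1] - p1_a[1], p2_a[0] - p1_a[0])
    angle_b = math.atan2(p2_b[1] - p1_b[1], p2_b[0] - p1_b[0])
    theta = angle_b - angle_a
    # Wrap the angle into (-pi, pi].
    theta = math.atan2(math.sin(theta), math.cos(theta))
    c, s = math.cos(theta), math.sin(theta)
    tx = p1_b[0] - (c * p1_a[0] - s * p1_a[1])
    ty = p1_b[1] - (s * p1_a[0] + c * p1_a[1])
    return theta, tx, ty


def apply_transform(point, theta, tx, ty):
    """Rotate a point by theta and then translate it by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


# Hypothetical stage coordinates of two reference points in two sessions.
theta, tx, ty = rigid_transform_from_two_points((0, 0), (1, 0), (2, 3), (2, 4))
print(math.degrees(theta), tx, ty)
```

Such a transform only brings the region of interest back into the field of view. Nanometer-scale registration still requires a fine alignment step, for example on fiducial beads, as described above.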

Exchange-PAINT

Exchange-PAINT was introduced by Jungmann et al. [7] in 2014 and builds on DNA-PAINT [6], which leverages the programmability and specificity of DNA hybridization: a docking strand is attached to the target, a fluorescently labeled imaging strand diffuses in solution, and the blinking signal arises from the transient binding and dissociation of the two strands. By sequentially exchanging buffers containing imaging strands that bind to different targets (Fig. 4A), the authors accomplished the first ten-color super-resolution imaging on synthetic DNA structures with sub-10-nm resolution. This technique theoretically enables unlimited multiplexing but is hindered by prolonged acquisition time. Increasing the concentration of the imaging strand raises the binding frequency and expedites imaging, but it also increases the number of unbound strands, which contributes to a higher background level.
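The trade-off between speed and background can be made concrete with a simple kinetic sketch. It assumes that the mean dark time between binding events at a single docking site is the inverse of the product of the association rate constant and the imager concentration, and that the background scales linearly with the concentration of free imager strands. The rate constant, concentrations, and function names below are illustrative assumptions, not values taken from the cited works.

```python
def mean_dark_time_s(k_on_per_molar_s: float, imager_conc_molar: float) -> float:
    """Mean time between binding events at one docking site: 1 / (k_on * c)."""
    if k_on_per_molar_s <= 0 or imager_conc_molar <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return 1.0 / (k_on_per_molar_s * imager_conc_molar)


def relative_background(imager_conc_molar: float, reference_conc_molar: float) -> float:
    """Background relative to a reference concentration (assumed linear)."""
    return imager_conc_molar / reference_conc_molar


K_ON = 1e6          # assumed association rate constant, 1/(M*s)
REFERENCE = 1e-9    # assumed reference imager concentration, 1 nM

for conc_nm in (0.5, 1.0, 5.0, 10.0):
    conc = conc_nm * 1e-9
    print(
        f"{conc_nm:>4} nM: dark time {mean_dark_time_s(K_ON, conc):8.1f} s, "
        f"background x{relative_background(conc, REFERENCE):.1f}"
    )
```

In this simplified picture a ten-fold higher imager concentration shortens the dark time ten-fold but also raises the background ten-fold, which motivates strategies that either suppress the signal of unbound strands or raise the association rate itself.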

Fig. 4

DNA-PAINT and its variants. A Schematic of Exchange-PAINT. B Schematic of FRET-based DNA-PAINT. C Schematic of caged DNA-PAINT. D Schematic of fluorogenic DNA-PAINT. E Effect of ethylene carbonate, repeat sequence, and spacer. Reproduced with permission from Ref. 46. Copyright 2020 Springer Nature. F Image of a mixed synapse containing VGlut1 (yellow) as a neurotransmitter transporter, Bassoon (red) as presynaptic, and Gephyrin (green) as postsynaptic marker. Reproduced with permission from Ref. 48. Copyright 2024 Cell Press

To solve this problem, in 2017 Auer et al. [39] and Lee et al. [40] independently proposed probes based on Förster resonance energy transfer (FRET), where the fluorescence of the acceptor strand was detected only when it bound to an excited donor (Fig. 4B). In 2020, Jang et al. [41] developed photoactivatable probes using reductive caging in DNA-PAINT, which employed chemically reductive imager strands and UV-TIRF (total internal reflection fluorescence) illumination to activate the imager strands only in close proximity to the coverslip surface (Fig. 4C). In 2021, Geertsema et al. [42] proposed to employ left-handed DNA instead of conventional right-handed DNA, as left-handed DNA does not hybridize with natural right-handed DNA, minimizing interference from cellular DNA during the hybridization between the target and probe. In 2022, Chung et al. [43] developed fluorogenic DNA-PAINT utilizing self-quenching single-stranded probes conjugated with a fluorophore and quencher at their terminals, which are strongly quenched in solution but become bright upon binding to the docking strand (Fig. 4D). All these methods can effectively reduce the background without affecting the imaging speed.

Another strategy to accelerate imaging is to increase the association rate of the imaging strands. In 2019, Schueder et al. [44] enhanced the hybridization kinetics in DNA-PAINT by optimizing the sequence and adjusting the concentration of MgCl2 in the solution, achieving a ten-fold improvement in acquisition speed. In 2020, Strauss et al. [45] employed overlapping repetitive sequences to achieve a 100-fold acceleration, reducing the imaging time to just 30 min for six targets. Similarly, Civitci et al. [46] improved the binding rate of the imaging strand by incorporating ethylene carbonate into the buffer, adding repeat sequences to the docking strand, and introducing a spacer between the docking strand and the affinity agent (Fig. 4E), and achieved multiplexed imaging in 2–5 min for each target. In 2021, Clowsley et al. [47] introduced repeat DNA-PAINT, which incorporates multi-repeat docking motifs to provide additional imager binding sites, significantly decreasing the background while increasing the acquisition speed. These improvements optimize the acquisition speed of DNA-PAINT, but the optimized sequences limit multiplexing to a small number of targets. In 2024, Unterauer et al. [48] proposed secondary label-based unlimited multiplexed DNA-PAINT (SUM-PAINT), a secondary-labeling approach that in principle allows unlimited multiplexing and that enabled the imaging of 30 proteins in neurons with a resolution better than 15 nm. With this technique, the authors unveiled three distinct synaptic subtypes within hippocampal neurons: canonical glutamatergic excitatory and GABAergic inhibitory synapses, and a mixed synaptic subtype (Fig. 4F).

Although DNA-PAINT-based approaches are incompatible with live-cell imaging, these developments facilitate highly multiplexed, low-crosstalk, and robust investigations of a wide range of protein species in their native, complex cellular and tissue contexts.

Simultaneous multicolor SMLM

Current simultaneous multicolor SMLM techniques mainly consist of ratiometric SMLM, spectroscopic SMLM, PSF engineering SMLM, and excitation modulation SMLM.

Ratiometric SMLM

Different fluorophores, even when imaged simultaneously in SMLM, can be distinguished based on their unique emission spectra. The idea of ratiometric SMLM was initially proposed by Schönle et al. [49] in 2007, where multiple fluorophores with closely overlapping spectra were excited using a single laser and differentiated based on the transmission-reflection ratio of the emission signal after passing through a dichroic mirror (Fig. 5A). This approach was later validated by Bossi et al. [50] in 2008, where the microtubules and keratin network in PtK2 cells were imaged with secondary antibodies labeled with SRA552-maleimide and SRA577-NHSS, respectively, achieving a lateral resolution of 10–15 nm with a crosstalk of approximately 5%.
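The color assignment at the heart of ratiometric SMLM can be sketched as follows: for each localization, the fraction of photons detected in the transmitted channel is compared with the ratio expected for each dye, and ambiguous localizations are rejected. This is a minimal illustration rather than any of the published algorithms. The dye labels, expected ratios, and tolerance are hypothetical.

```python
def intensity_ratio(transmitted: float, reflected: float) -> float:
    """Fraction of the detected photons found in the transmitted channel."""
    total = transmitted + reflected
    if total <= 0:
        raise ValueError("total photon count must be positive")
    return transmitted / total


def assign_color(transmitted, reflected, expected_ratios, tolerance=0.08):
    """Return the dye whose expected ratio is closest, or None if ambiguous.

    expected_ratios: mapping of dye label -> expected transmitted fraction,
    e.g. obtained from single-color calibration samples.
    """
    ratio = intensity_ratio(transmitted, reflected)
    best = min(expected_ratios, key=lambda dye: abs(expected_ratios[dye] - ratio))
    if abs(expected_ratios[best] - ratio) > tolerance:
        return None  # reject rather than risk a misassignment
    return best


# Hypothetical calibration: two dyes with overlapping emission spectra.
EXPECTED = {"dye_A": 0.35, "dye_B": 0.65}
print(assign_color(350, 650, EXPECTED))
print(assign_color(500, 500, EXPECTED))
```

Tightening the tolerance lowers crosstalk at the expense of discarding more localizations, which is the basic trade-off that the more elaborate classification schemes discussed below try to improve.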

AF647, known for its low duty cycle and high photon number per switching event, is a widely used commercial fluorophore in SMLM. Subsequent work therefore focused on discriminating AF647 from other dyes such as AF750 [51], AF700 [52], CF680 [53], and DY678 [54]. As mentioned above, selecting spectrally suitable fluorescent dyes for SMLM remains a challenge: well-separated emission spectra yield low crosstalk (typically 1% [52]) but large chromatic aberrations, whereas closely overlapping emission spectra eliminate chromatic aberrations but demand more complex algorithms to reduce the crosstalk. In 2015, Lampe et al. [54] proposed a unique pair-finding algorithm for color assignment and achieved two-color 3D imaging of microtubules and clathrin heavy chain in NIH 3T3 cells labeled with AF647 and DY678, respectively, with a lateral resolution of 25 nm, an axial resolution of 66 nm, and a crosstalk within 2%. In 2022, Siemons et al. [55] presented probability-based fluorophore classification in ratiometric SMLM using three spectrally overlapping fluorophores, AF647, CF660, and CF680, to image tyrosinated tubulin, vimentin, and clathrin heavy chain in COS-7 cells, respectively, achieving a crosstalk within 1%. However, this method uses only a portion of the collected photons for localization, leading to a decrease in spatial resolution. Andronov et al. [56] proposed splitSMLM, in which dichroic mirrors and filters were selected to provide similar spectral widths to the two channels and a weighted averaging localization algorithm was developed, allowing all detected photons to be used for both spectral separation and spatial localization. The authors demonstrated three-color imaging of NPCs in U2OS cells labeled with AF647, CF660C, and CF680 with a resolution of 20 nm and a crosstalk within 2%, successfully refining the positioning of individual NPC proteins and revealing that Pom121 clusters act as NPC deposition loci. Li et al. [57] proposed globLoc, a global fitting algorithm that utilizes flexible PSF modeling and parameter sharing to maximize the information extracted from multicolor single-molecule data. The authors achieved four-color 3D imaging of Nup62, Nup96, ELYS, and WGA within single NPCs labeled with DY634, AF647, CF660C, and CF680, respectively, without apparent crosstalk (Fig. 5B), representing the highest number of colors in ratiometric SMLM.

Fig. 5

Ratiometric SMLM. A Setup diagram of ratiometric SMLM in single-objective microscopy (left), and simulated images of different fluorophores acquired in reflected and transmitted paths (right). B Image of Nup96 (yellow), ELYS (red), Nup62 (cyan), and WGA (magenta) within single NPCs, labeled with AF647, CF660C, DY634, and CF680, respectively. Reproduced with permission from Ref. 57. Copyright 2022 Springer Nature. C Image of microtubules (cyan), vimentin (yellow), and clathrin (magenta) in COS-7 cells, labeled with ATTO655, ATTO680, and ATTO700, respectively. Reproduced with permission from Ref. 59. Copyright 2022 American Chemical Society. D Image of clathrin (yellow) and tubulin (red) in COS-7 cells, labeled with Cy3B and ATTO643, respectively. Reproduced with permission from Ref. 60. Copyright 2023 Elsevier. E Setup diagram of salvaged fluorescence 4Pi-SMLM. F Setup diagram of ratiometric SMLM in 4Pi microscopy

To reduce the linkage error caused by the finite size of primary and secondary antibodies, in 2015 Platonova et al. [58] used nanobodies to deliver dyes to fluorescent protein fusion constructs, allowing accurate labeling of cellular structures at the scale of tens of nanometers with minimal linkage error. The authors cotransfected Caveolin1-EGFP and Caveolin1-mCherry in U2OS cells and discriminated two cellular structures less than 50 nm apart. DNA-PAINT can provide nanometer localization precision due to its high photon counts and immunity to photobleaching, but sequential imaging with buffer exchange complicates the experiment and extends the acquisition time. To retain the benefits of DNA-PAINT while avoiding these shortcomings, in 2022 Gimber et al. [59] combined ratiometric imaging with DNA-PAINT for simultaneous three-color imaging. This approach, named spectral demixing DNA-PAINT (SD-DNA-PAINT), follows the earlier pair-finding algorithm [54] and was used to image microtubules, vimentin, and clathrin in COS-7 cells labeled with ATTO655, ATTO680, and ATTO700, respectively, with a resolution of 7–14 nm and a crosstalk within 5% (Fig. 5C). SD-DNA-PAINT shortens the acquisition time three-fold for three-color imaging compared to Exchange-PAINT [7], and achieves higher localization precision and more imaging channels compared to SD-dSTORM [54]. Similarly, in 2023, Friedl et al. [60] proposed simultaneous two-color PAINT (S2C-PAINT), which combined ratiometric and astigmatic imaging [61] with DNA-PAINT and achieved two-color 3D imaging of clathrin and tubulin in COS-7 cells labeled with Cy3B and ATTO643, respectively, with a crosstalk lower than 4% and an imaging depth up to 1 μm (Fig. 5D). The primary advantage of S2C-PAINT is its straightforward extension to multiplexing with four or more channels, at the expense of reduced imaging speed.

Ratiometric SMLM can be integrated with innovative physical elements to enhance its performance. In 2020, Vissa et al. [62] proposed using a thin film tunable filter (TTF) instead of a dichroic mirror for spectral separation, allowing flexible selection of the transmitted wavelength range by adjusting the incidence angle. The authors imaged the mitochondrial protein TOM20 and the peroxisomal protein PMP70 in HeLa cells labeled with AF647 and CF680, respectively, achieving a lateral resolution of 75–80 nm. By applying a high photon-output threshold and density filtering, this approach effectively reduces the intensity and number of undesired localizations in each channel, at the cost of up to 12% photon loss due to the inherently narrow bandwidth of the TTF. In 2021, Wang et al. [63] proposed a two-color SMLM method utilizing a colorimetry camera equipped with five types of pixels: red, green, blue, far infrared, and white. This method determined the color of fluorophores based on the intensity ratio of colored pixels, while localizing single molecules using white pixels. The authors imaged microtubules and mitochondria in COS-7 cells labeled with DL633 and CF680, respectively, with a lateral resolution of 20 nm and a crosstalk of 2%. This approach requires only one camera for multicolor SMLM, significantly simplifying the optical setup and offering convenient multicolor imaging capabilities even for inexperienced users.

The above-mentioned studies were all developed for single-objective SMLM. In 4Pi-SMLM [64,65,66], which uses two opposing objectives in a so-called 4Pi geometry and interferometric detection to achieve 3D resolution down to 10 nm, it is challenging to insert additional beam-splitting elements into the sophisticated emission path. To address this issue, in 2020 Zhang et al. [31] proposed salvaged fluorescence, where an additional dichroic mirror was inserted into the excitation path to ‘salvage’ the originally unused fluorescence for color identification without disturbing the emission path (Fig. 5E). The authors achieved two-color imaging of endoplasmic reticulum membrane and microtubules in COS-7 cells labeled with AF647 and CF660C, as well as three-color imaging of cis, medial, and trans Golgi proteins in HeLa cells labeled with DY634, DL650, and CF680, with a 3D resolution of around 20 nm and a crosstalk below 2%. In 2022, Chen et al. [67] performed ratiometric imaging in 4Pi-SMLM with a further simplified configuration, where two identical filters were inserted into two of the four detection arms to generate an intensity difference for color recognition (Fig. 5F). This method requires fewer instrumentation modifications and, as the authors claimed, offers higher photon collection efficiency than salvaged fluorescence, thereby improving localization precision.

Ratiometric SMLM requires only one excitation laser, which simplifies the instrumentation. Meanwhile, far-red fluorescent probes with a high photon count per switching event, low background, and a moderate number of blinking events ensure high localization precision. However, overlapping spectra cannot be completely separated by this type of method, leading to crosstalk between colors, especially when the photon budget is low. One alternative is to extract additional information from the emission patterns to assist color assignment, based on the observation that PSFs at longer wavelengths exhibit a lower cutoff frequency and a larger size [68, 69].

Spectroscopic SMLM

Spectroscopic SMLM (sSMLM) incorporates dispersive elements such as prisms and gratings in the emission path to introduce spectrum-dependent elongation, and thus encodes spectrum information in single-molecule emission patterns.

In 2015, Zhang et al. [70] introduced a prism-based single-molecule spectrum imaging method in a dual-objective microscope, where a prism was placed in one of the detection arms to encode spectrum information in the dispersed single-molecule emission patterns for color identification, and the other arm was used for spatial localization as usual. This approach was employed to image peroxisomes, vimentin filaments, microtubules, and mitochondrial outer membrane in PtK2 cells labeled with DY634, DL650, CF660C, and CF680, respectively, achieving a spectral resolution of 10 nm with a crosstalk within 2% (Fig. 6A). Although this configuration can detect up to four colors simultaneously in fixed cells, its horizontal implementation is not suitable for live-cell imaging. To address this issue, in 2022 Butler et al. [71] conducted prism-based multicolor imaging using an inverted microscope, employing a bottom oil-immersion objective for spatial localization and a top water-dipping objective for spectral measurement, which simplified sample mounting by eliminating the need to position samples between two closely spaced coverslips. The authors demonstrated three-color 3D imaging of mitochondria, microtubules, and the nuclear envelope in COS-7 cells, labeled with Cy3B, ATTO647N, and ATTO700, respectively. In 2016, Mlodzianoski et al. [72] developed prism-based spectrum imaging in a single-objective microscope using a 50:50 beam splitter to split the fluorescence into two paths, with a prism inserted in one of them. With this method the authors observed ‘spectrum wandering’, a phenomenon that, when combined with localization, can reveal nanoscale variations in the molecular environment. In 2017, Moon et al. [73] demonstrated the applicability of a similar optical setup for live-cell imaging. However, these configurations require multiple discrete optical components with stringent alignment requirements. In 2022, Song et al. [74] designed a dual-wedge prism-based spectrometer (Fig. 6B), which significantly simplified the optical setup through its monolithic construction and enhanced spatial and spectral transmission efficiencies through meticulous material selection, precise dimensioning, and anti-reflection coating. The authors achieved a spatial resolution of 10 nm and a spectral resolution of 4.5 nm within a 2000-photon budget, along with a photon transmission efficiency of 40%. This approach facilitates seamless integration into conventional microscopes, thereby broadening the adoption of sSMLM across diverse users in the biological research community. Prism-based sSMLM has also been extended to applications such as single-particle tracking [75,76,77].
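One common way to turn a dispersed emission pattern into a color label is to compute its intensity-weighted mean wavelength (the spectral centroid) and compare it with reference values from single-dye calibrations. The sketch below illustrates this idea under a locally linear pixel-to-wavelength calibration, which is a simplification because prism dispersion is nonlinear in general. The function names, calibration constants, and reference centroids are hypothetical.

```python
def spectral_centroid_nm(pixel_offsets, intensities, ref_wavelength_nm, nm_per_pixel):
    """Intensity-weighted mean wavelength of a dispersed emission pattern.

    pixel_offsets: positions along the dispersion axis, relative to the
    pixel that corresponds to ref_wavelength_nm (assumed linear calibration).
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must contain signal")
    wavelengths = [ref_wavelength_nm + p * nm_per_pixel for p in pixel_offsets]
    return sum(w * i for w, i in zip(wavelengths, intensities)) / total


def nearest_dye(centroid_nm, reference_centroids_nm):
    """Label of the reference dye whose centroid is closest to the measurement."""
    return min(
        reference_centroids_nm,
        key=lambda dye: abs(reference_centroids_nm[dye] - centroid_nm),
    )


# Hypothetical calibration values for three spectrally close dyes.
REFERENCES = {"dye_1": 670.0, "dye_2": 680.0, "dye_3": 690.0}
centroid = spectral_centroid_nm([0, 1, 2, 3, 4], [0, 1, 2, 1, 0], 670.0, 5.0)
print(centroid, nearest_dye(centroid, REFERENCES))
```

Because the centroid is averaged over all photons in the dispersed image, its uncertainty shrinks with the photon count, which is why spectral resolution figures are usually quoted for a given photon budget.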

Fig. 6

Spectroscopic SMLM. A Schematic of sSMLM based on the prism and image of peroxisomes (green), vimentin filaments (magenta), microtubules (yellow), and mitochondrial outer membrane (cyan) in PtK2 cells, labeled with DY634, DL650, CF660C, and CF680, respectively. Reproduced with permission from Ref. 70. Copyright 2015 Springer Nature. B Schematic of sSMLM based on the dual-wedge prism. C Schematic of sSMLM based on the reflection grating and image of microtubules (orange) and mitochondria (green) in COS-7 cells, labeled with AF568 and Mito-EOS 4b, respectively, and auto-fluorescent spot (blue) from background. Reproduced with permission from Ref. 78. Copyright 2016 Springer Nature. D Schematic of sSMLM based on the transmission grating. The resulting PSFs for spatial localization and color identification are shown next to the setup diagrams, with a scale bar of 2 μm

Another widely-used dispersive elements, diffraction gratings, can also be utilized in sSMLM. In 2016, Dong et al. [78] proposed sSMLM based on gratings, where a reflection grating with a period of 150 lines/mm was used to generate the 0th and 1st diffraction order images with an intensity ratio of 1:3. The 0th order image was used for spatial localization and the 1st order image was used for spectral identification. The authors imaged microtubules and mitochondria in COS-7 cells labeled with AF568 and Mito-EOS 4b, respectively, with a spatial resolution of 25 nm and a spectral resolution of 0.63 nm/pixel (Fig. 6C). However, in this method only a quarter of the photons were utilized for localization, leading to a 50% reduction in localization precision. To further enhance spatial resolution and mitigate photon loss in transmission gratings, the same year Bongiovanni et al. [79] introduced the use of a blazed transmission diffraction grating with a period of 300 lines/mm in sSMLM, which was placed in front of the camera to differentiate the 0th and 1st orders, maintaining an intensity ratio of 3:2. The authors used Nile red’s hydrophobic sensitivity to monitor environmental changes in human epithelial cells, achieving a spatial resolution of 42 nm and a spectral resolution of 9 nm. This method can super-resolve the hydrophobicity of amyloid aggregates associated with neurodegenerative diseases as well as the hydrophobic alterations occurring in mammalian cell membranes. While this method enhances photon usage for localization, not all localizations can be effectively utilized in spectral analysis, degrading the spatial resolution. In 2019, Song et al. [80] introduced the biplane configuration [81] in sSMLM, involving a pair of mirrors to create an optical distance difference without the need of additional optical elements. 
The authors successfully imaged microtubules and mitochondria in COS-7 cells labeled with AF647 and CF660C, respectively, achieving a lateral resolution of 47 nm and an axial resolution of 118 nm with an average photon count of 550. In 2020, Song et al. [82] introduced a symmetrically dispersed sSMLM approach, where a transmission grating with a groove density of 80 lines/mm was utilized to generate the ±1st diffraction orders, thereby creating symmetrical images (Fig. 6D). In this method, spatial information was extracted by pinpointing the midpoint of the two symmetrical spectral images, while spectral information was derived from the distance between them. Compared to the previous technique [78], this method resulted in a 42% improvement in spatial precision and a 10% enhancement in spectral precision, achieving a spatial resolution of 25 nm and a spectral resolution of 1.9 nm within a 1000-photon budget at the expense of 28.5% photon loss in the 0th order. The authors imaged microtubules and mitochondria in COS-7 cells labeled with AF647 and CF680, respectively, with a spatial resolution of 66 nm. To further increase the localization density, in 2022 Martens et al. [83] proposed high-density sSMLM by simply incorporating a grating with a groove density of 70 lines/mm in front of the camera, which can differentiate fluorophores with peak emission wavelengths less than 15 nm apart, leading to a five-fold increase in emitter density. This method, characterized by its inherent simplicity and photon efficiency, is well-suited for applications requiring effective photon-based separation of spectrally distinct entities, such as in low-signal flow cytometry.
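The midpoint-and-distance idea behind symmetrically dispersed sSMLM can be illustrated with a minimal sketch. It is not the implementation of Ref. 82: the function name, the linear dispersion model, and all calibration values (dispersion in nm per pixel, reference separation, reference wavelength) are hypothetical and chosen only for illustration.

```python
import numpy as np

def decode_symmetric_pair(c_minus, c_plus, nm_per_px, ref_sep_px, ref_wavelength_nm):
    """Recover emitter position and spectral centroid from the -1st/+1st order images.

    c_minus, c_plus   : (x, y) centroids of the two spectral images, in pixels.
    nm_per_px         : assumed linear dispersion of a single order (nm per pixel).
    ref_sep_px        : assumed separation of the two orders at ref_wavelength_nm.
    """
    c_minus = np.asarray(c_minus, dtype=float)
    c_plus = np.asarray(c_plus, dtype=float)
    # Spatial information: the midpoint of the two mirrored images.
    position = 0.5 * (c_minus + c_plus)
    # Spectral information: each order shifts by half of the change in separation.
    separation_px = float(np.linalg.norm(c_plus - c_minus))
    wavelength_nm = ref_wavelength_nm + 0.5 * (separation_px - ref_sep_px) * nm_per_px
    return position, wavelength_nm

# Hypothetical numbers: the orders sit 18 px apart at 650 nm, with 5 nm/px dispersion.
pos, wl = decode_symmetric_pair((40.0, 50.0), (60.0, 50.0),
                                nm_per_px=5.0, ref_sep_px=18.0, ref_wavelength_nm=650.0)
print(pos, wl)
```

Because both orders carry the same photons-per-image statistics, averaging their centroids uses all detected photons for the position estimate, which is the motivation the text gives for the improved spatial precision.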

Deep learning can be combined with sSMLM to improve the imaging speed, as well as the spatial and spectral resolution. In 2019, Zhang et al. [84] developed a neural network-based spectral classification method to reduce color misassignment rates, achieving a classification accuracy of 99.8% and a lateral resolution of 47 nm in imaging tubulin and mitochondria in COS-7 cells labeled with AF647 and CF660, respectively. In 2020, Gaire et al. [85] developed a deep-learning algorithm to reconstruct high-density super-resolution images from low-density images, achieving an 8-fold reduction in data acquisition time for two-color imaging of peroxisomes and mitochondria in fixed COS-7 cells and a 6.67-fold reduction for three-color imaging of tubulin, mitochondria, and peroxisomes in fixed U2OS cells without compromising the spatial resolution. In 2023, Manko et al. [86] proposed srUnet, a U-net-based spectral image processing method to enhance spectral and spatial signals and compensate for photon loss, achieving a spectral resolution of 4.5 nm and a spatial resolution of 6 nm, as compared to 9 nm for the raw data, within a 1000-photon budget. In 2024, Gaire et al. [87] developed a computational method for low-photon-budget scenarios, which utilized a two-network model including a U-net for spatial PSF localization and a deep convolutional neural network for spectral PSF enhancement. The authors reconstructed the spatial organization of immunofluorescence-labeled histone markers, achieving a localization count that is 8.8% higher than that obtained through conventional sSMLM reconstruction.

In comparison to ratiometric SMLM, spectroscopic SMLM provides detailed spectral information of fluorophores rather than solely distinguishing colors, with minimal crosstalk across different channels. Using extensively broadened emission patterns enhances spectral resolution and color recognition accuracy but reduces the photon count per pixel, thereby lowering the localization precision while also demanding a lower emitter density to prevent overlap. Therefore, achieving a balance between spectral and spatial resolution is crucial for optimizing the performance of sSMLM techniques. Additionally, the efficiency of diffraction orders should be considered when using a grating for dispersion, as this can lead to photon loss and subsequently deteriorate localization precision.
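The link between grating efficiency and localization precision can be made explicit with a short worked estimate. It assumes the common shot-noise-limited, background-free approximation in which the localization uncertainty scales with the inverse square root of the detected photon number, and it ignores photons lost to unused diffraction orders. The symbols s (PSF standard deviation), N (detected photons), and f (fraction of photons routed to the localization image) are introduced here for illustration only.

```latex
\sigma_{\mathrm{loc}} \approx \frac{s}{\sqrt{N}}
\quad\Longrightarrow\quad
\sigma_{\mathrm{loc}}(f) \approx \frac{s}{\sqrt{fN}}
  = \frac{\sigma_{\mathrm{loc}}(1)}{\sqrt{f}}

% 0th:1st order intensity ratio 1:3  ->  f = 1/4
\frac{\sigma_{\mathrm{loc}}(1/4)}{\sigma_{\mathrm{loc}}(1)} = \frac{1}{\sqrt{1/4}} = 2

% 0th:1st order intensity ratio 3:2  ->  f = 3/5
\frac{\sigma_{\mathrm{loc}}(3/5)}{\sigma_{\mathrm{loc}}(1)} = \frac{1}{\sqrt{3/5}} \approx 1.29
```

Under these assumptions, routing only a quarter of the photons to the localization image doubles the localization uncertainty, whereas a 3:2 split increases it by roughly 29%.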

PSF engineering SMLM

An experimental emission pattern, known as the PSF, carries a wealth of information. For instance, PSFs associated with longer wavelengths demonstrate a lower cutoff frequency and a broader shape, which can be leveraged for color recognition in multicolor SMLM imaging [68, 69]. However, this distinction is too subtle to be accurately identified without the application of deep-learning-based algorithms. The PSF engineering approach, commonly employed for axial localization in SMLM, encodes the axial positions of single molecules in the form of their PSFs by utilizing various techniques such as a cylindrical lens [61], prefabricated dielectric mask [88, 89], liquid crystal spatial light modulator (SLM) [90,91,92,93], or deformable mirror [94, 95]. With smart design, this method can also be adapted to facilitate multicolor SMLM imaging.

In 2014, Broeken et al. [96] proposed a method for simultaneously measuring the position and color of single molecules, which used an SLM to create a large-pitch grating with a period of 1.2 mm to generate side lobes around the main lobe of the PSF (Fig. 7A). The spacing between the main and side lobes exhibited a linear relationship with the emission wavelength, enabling the differentiation of closely positioned spectra such as QD605, QD655, and QD705 with misidentification rates of 5.6%, 22%, and 8.9%, respectively.
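Since the lobe spacing varies linearly with emission wavelength, color assignment reduces to a linear calibration followed by a nearest-candidate lookup. The sketch below illustrates this under stated assumptions: the calibration spacings in pixels are invented for illustration (they are not values from Ref. 96), and the function names are hypothetical.

```python
import numpy as np

# Hypothetical calibration: side-lobe spacing (pixels) measured for known emitters.
calib_wavelength_nm = np.array([605.0, 655.0, 705.0])
calib_spacing_px = np.array([12.1, 13.1, 14.1])

# Fit spacing = slope * wavelength + intercept (first-order polynomial).
slope, intercept = np.polyfit(calib_wavelength_nm, calib_spacing_px, 1)

def spacing_to_wavelength(spacing_px):
    """Invert the linear calibration to estimate the emission wavelength in nm."""
    return (spacing_px - intercept) / slope

def assign_color(spacing_px, candidates_nm=(605.0, 655.0, 705.0)):
    """Assign the candidate emitter whose wavelength is closest to the estimate."""
    candidates = np.asarray(candidates_nm, dtype=float)
    estimate = spacing_to_wavelength(spacing_px)
    return float(candidates[np.argmin(np.abs(candidates - estimate))])

# A measured spacing of 13.0 px maps to about 650 nm, closest to the 655 nm emitter.
print(spacing_to_wavelength(13.0), assign_color(13.0))
```

In practice the spacing estimate is noisy, so emitters whose estimated wavelength falls near the midpoint between two candidates are the ones most prone to misidentification.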

Fig. 7

Multicolor SMLM based on PSF engineering. A Setup diagram of PSF engineering based on a large-pitch grating and measured PSFs of beads. Reproduced with permission from Ref. 96. Copyright 2014 Optica Publishing Group. SLM: spatial light modulator. B Setup diagram of two-channel PSF engineering and simulated PSFs of microspheres. Reproduced with permission from Ref. 97. Copyright 2016 Springer Nature. C Setup diagram of multiplexed PSF engineering and measured PSFs of microspheres. Reproduced with permission from Ref. 98. Copyright 2021 American Chemical Society. D Setup diagram of Circulator and detected PSFs for different fluorophores. QWP: quarter-wave plate. PBS: polarizing beam splitter. PCE: polarization-compensating element. Reproduced with permission from Ref. 100. Copyright 2024 Springer Nature. E Image of microtubules (yellow), vimentin (blue), and clathrin (red), labeled with ATTO643, AS488, and Cy3B, respectively. Reproduced with permission from Ref. 100. Copyright 2024 Springer Nature

In 2016, Shechtman et al. [97] proposed two-color volumetric SMLM imaging based on PSF engineering (Fig. 7B), where an SLM was utilized to create a phase mask that imparts distinct phase delays for different wavelengths, resulting in varying PSFs from red and green fluorophores at different axial positions. The authors achieved two-color imaging of microtubules and mitochondria in BS-C-1 cells, labeled with AF647 and AF532, respectively, with a lateral resolution of 50 nm. This innovative method paves the way for applying PSF engineering in multicolor SMLM, yet it has several drawbacks. Firstly, the SLM is limited to modulating only s-polarization, leading to a 50% photon loss. Secondly, the specific wavelength design may not match the emission spectra of many fluorophores commonly used in SMLM. Building upon this concept, in 2021 Opatovski et al. [98] introduced multiplexed PSF engineering (Fig. 7C), where two dichroic mirrors were used to separate three channels, each incorporating a unique phase mask to shape PSFs in a tetrapod configuration with varying orientations. By implementing this method, the PSFs can be tailored with increased flexibility and reduced photon loss, all while maintaining the field of view. This method represents a promising advancement for multicolor localization and optimizing photon utilization efficiency, although it was developed for single particle tracking.

In 2019, Hershko et al. [99] introduced a neural-network-based approach with two components. The first was trained with raw data from two-color quantum dots to enable color discrimination, while the second incorporated an SLM optimizer and a reconstruction network to enhance aberrations for precise localization and accurate color identification. With this methodology, the authors achieved a classification accuracy of 96.4% for quantum dots using standard PSFs, and improved it to 99.4% with the optimized SLM pattern. The authors also imaged microtubules and mitochondria in HeLa cells, labeled with AF647 and AF555, respectively.

In 2024, Van den Eynde et al. [100] introduced an add-on module called Circulator to encode color information in SMLM imaging (Fig. 7D). This module utilized a polarizing beam splitter to separate the emission light into two paths, with each path being transmitted or reflected by a dichroic mirror, generating a pair of split PSFs whose rotation angles corresponded to the color of the fluorophore. Using this method, the authors achieved simultaneous three-color imaging of microtubules, vimentin, and clathrin, labeled with ATTO643, AS488, and Cy3B, respectively, thereby tripling the overall acquisition speed (Fig. 7E).

While PSF engineering SMLM effectively minimizes crosstalk between different channels, it does have certain limitations. Firstly, the elongated shape of the PSFs provides detailed localization and color information but requires a substantial number of pixels on the camera chip. This requirement can complicate the application of this technique to high-density single-molecule data, which may be mitigated by incorporating dense localization algorithms [101, 102]. Secondly, extending this method to accommodate more channels presents difficulties, as designing and recognizing PSFs with distinct shapes at various wavelengths becomes increasingly complex. Employing deep-learning-based techniques for these processes offers a promising solution.

Excitation modulation SMLM

In ratiometric SMLM, spectroscopic SMLM, and PSF engineering SMLM, color assignments rely on the emission spectra of fluorophores, with the accuracy of color classification dependent on the distinct emission spectra of the dyes. The challenge of precise classification arises when emission spectra are closely spaced, a common occurrence in multicolor SMLM techniques.

To solve this problem, in 2018 Gómez-García et al. [103] introduced a multicolor DNA-PAINT method, termed fm-DNA-PAINT, which applied sine-wave modulation at a different frequency to each excitation laser through acousto-optic modulators and analyzed the detected intensity of each pixel to track the brightness trend of each PSF for color assignment (Fig. 8A). The authors demonstrated two-color imaging with Cy5-labeled microtubules and Cy3-labeled mitochondria in BS-C-1 cells with a lateral resolution of 46 nm and a crosstalk of only 2.8% (Fig. 8B). By relying on excitation modulation, this approach preserves the advantages of conventional DNA-PAINT while enabling simultaneous acquisition with minimal crosstalk. The number of colors is limited only by commercially available oligo-coupled antibodies, suggesting potential for further expansion, although a persistent challenge lies in effectively managing the photophysical characteristics of multiple fluorophores within a single buffer.
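The principle of assigning color from the modulation frequency of an emitter's intensity trace can be sketched with a simple lock-in style demodulation. This is an illustrative sketch rather than the analysis pipeline of Ref. 103: the function name, frame rate, modulation frequencies, and the noise-free synthetic trace are all assumptions.

```python
import numpy as np

def assign_by_modulation(trace, frame_rate_hz, laser_freqs_hz):
    """Return the index of the laser whose modulation frequency dominates the trace.

    trace          : per-frame intensity of one emitter (arbitrary units).
    frame_rate_hz  : camera frame rate; must exceed twice the highest frequency.
    laser_freqs_hz : modulation frequency assigned to each excitation laser.
    """
    trace = np.asarray(trace, dtype=float)
    t = np.arange(trace.size) / frame_rate_hz
    ac = trace - trace.mean()  # remove the constant offset
    # Amplitude of the sinusoidal component at each candidate frequency.
    amplitudes = np.array([
        2.0 * np.abs(np.sum(ac * np.exp(-2j * np.pi * f * t))) / trace.size
        for f in laser_freqs_hz
    ])
    return int(np.argmax(amplitudes)), amplitudes

# Synthetic emitter excited by the laser modulated at 8 Hz (hypothetical values).
frame_rate_hz = 100.0
laser_freqs_hz = (5.0, 8.0)
t = np.arange(200) / frame_rate_hz
trace = 100.0 * (1.0 + np.sin(2 * np.pi * 8.0 * t))
channel, amplitudes = assign_by_modulation(trace, frame_rate_hz, laser_freqs_hz)
print(channel, amplitudes)
```

The example uses a trace that spans an integer number of periods of both frequencies, so the two components separate cleanly; with short binding events or shot noise, the residual amplitude at the wrong frequency is one source of crosstalk.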

Fig. 8

Multicolor SMLM based on excitation modulation. A Setup diagram of fm-DNA-PAINT and representative example PSFs with frequency modulation. Scale bar, 250 nm. AOM: acousto-optic modulator. B Image of Cy5-labeled microtubules (green) and Cy3-labeled mitochondria (magenta) in BS-C-1 cells. Reproduced with permission from Ref. 103. Copyright 2018 the National Academy of Sciences of the United States of America. C Setup diagram of ExR-STORM, representative example PSFs at different excitation wavelengths, and absorption spectra of the fluorophores. AOTF: acousto-optic tunable filter. D Image of microtubules (yellow), intermediate filaments (magenta), endoplasmic reticulum (green), and mitochondrial outer membrane (cyan) in COS-7 cells, labeled with DL633, AF647, DY654, and CF660C, respectively. Reproduced with permission from Ref. 104. Copyright 2023 Springer Nature

In 2023, Wu et al. [104] introduced excitation-resolved stochastic optical reconstruction microscopy (ExR-STORM) (Fig. 8C). The method sequentially illuminates the specimen with lasers at nearby wavelengths, which selectively excite fluorophores with distinct excitation spectra, and assigns colors based on the emission responses acquired quasi-simultaneously with a high-frequency galvo-mirror scanning system within a single exposure cycle [105, 106]. The authors successfully imaged microtubules, intermediate filaments, endoplasmic reticulum, and mitochondrial outer membrane in COS-7 cells labeled with DL633, AF647, DY654, and CF660C, respectively, with a lateral resolution of 19 nm, an axial resolution of 55 nm, a crosstalk within 3%, and a rejection rate within 35% (Fig. 8D). This method can distinguish fluorophores with emission peaks separated by as little as 5 nm, although the use of three lasers with closely spaced wavelengths may not be the most cost-effective approach.
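Color assignment from excitation responses, together with the rejection of ambiguous emitters, can be sketched as a nearest-reference classification. This is a minimal illustration under stated assumptions, not the algorithm of Ref. 104: the dye names, reference response vectors, distance metric, and rejection threshold are all hypothetical.

```python
import numpy as np

def classify_by_excitation(photons, references, max_distance=0.1):
    """Assign a dye from the photon counts detected under each excitation laser.

    photons      : photon counts of one emitter, one entry per excitation wavelength.
    references   : dict mapping dye name -> expected relative response per laser.
    max_distance : emitters farther than this from every reference are rejected.
    """
    v = np.asarray(photons, dtype=float)
    v = v / v.sum()  # normalize so that brightness does not affect the assignment
    best_name, best_dist = None, np.inf
    for name, ref in references.items():
        r = np.asarray(ref, dtype=float)
        r = r / r.sum()
        dist = float(np.linalg.norm(v - r))
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Reject ambiguous emitters instead of forcing an assignment.
    return best_name if best_dist <= max_distance else None

# Hypothetical relative responses of two dyes to three excitation wavelengths.
references = {"dye_A": (0.5, 0.3, 0.2), "dye_B": (0.2, 0.3, 0.5)}
label_clear = classify_by_excitation((500, 310, 190), references)
label_ambiguous = classify_by_excitation((350, 300, 350), references)
print(label_clear, label_ambiguous)
```

The threshold expresses the trade-off reported in the text: a stricter threshold lowers crosstalk at the cost of a higher rejection rate.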

Excitation modulation SMLM is an innovative and promising technique for identifying the color of fluorophores with similar emission spectra. This technique offers low crosstalk and facilitates straightforward expansion to more channels, although it comes with increased complexity and higher costs in instrumentation.

Conclusions and outlook

In this review, we introduced the existing multicolor SMLM approaches (Tables 1 and 2) with the following advantages and disadvantages.

Table 1 Comparison between different multicolor SMLM methods
Table 2 Multicolor dye pairs used in SMLM
(1) Sequential multicolor SMLM facilitates highly-multiplexed and low-crosstalk investigations of biological structures while suffering from prolonged acquisition time.

(2) Ratiometric SMLM features simple instrumentation and is compatible with far-red fluorophores that exhibit excellent blinking behaviors, but it is essential to address concerns regarding crosstalk suppression to ensure accurate color assignment.

(3) Spectroscopic SMLM provides detailed spectral information but needs a tradeoff between spectral and spatial resolution.

(4) PSF engineering SMLM is a promising advancement for enhancing multicolor localization and optimizing photon efficiency, at the expense of potential overlap of elongated emission patterns.

(5) Excitation modulation SMLM has a remarkable capability to distinguish fluorophores with similar emission spectra, despite the increased complexity it introduces to the instrumentation.

The development of multicolor SMLM is an emerging field that aims to create essential and robust tools to study the spatial and temporal relationships and interactions of cellular and tissue constituents, especially at the subcellular and organelle level. The biological impact of research in this exciting field is already significant, and we believe the following aspects will further broaden its application range towards profound biological and biomedical challenges:

(1) Extension to dynamic imaging, including probe development for live cells [107,108,109,110] and color assignment for dense data [111,112,113], can facilitate the simultaneous visualization of multiple protein species with precise spatial and temporal resolution.

(2) Extension to large field-of-view imaging, particularly through the integration of high-power homogeneous illumination [114, 115], field-dependent single-molecule data analysis [116,117,118], and image stitching techniques [119, 120], can significantly enhance the throughput of biological studies.

(3) Extension to thick tissues, including combination with adaptive optics [121,122,123,124,125,126,127,128,129], tissue clearing [130,131,132], and in situ PSF model reconstruction [133, 134], can provide valuable insights into the structure, function, and pathology of complex biological systems.

(4) Extension to more channels, for example 5–10 channels, potentially achieved by employing appropriate dyes [135, 136], combining multiple multicolor SMLM techniques, and applying advanced computational post-processing algorithms [101, 102, 137, 138], can give a detailed elucidation of the reorganization, interactions, and alterations in molecular composition within cells and tissues.

(5) Extension to multimodal super-resolution imaging, including polarization states [139,140,141], molecular movements [142, 143], and lifetime [144, 145], can help gather complementary information within one sample, allowing for a better understanding of structures and functions.

(6) Extension to ultra-high-resolution imaging, including combination with minimal photon fluxes (MINFLUX) [8, 9], repetitive optical selective exposure (ROSE) [105, 106], or other modulation-enhanced localization microscopy [146,147,148], can promote the investigation of samples with single-digit nanometer resolution.

With these developments, we anticipate that multicolor SMLM will provide unprecedented levels of detail, enhancing our understanding of cellular mechanisms, protein interactions, and dynamics of molecular assemblies across a variety of biological contexts.