A new type of disordered CP12 protein in the marine diatom Thalassiosira pseudonana

CP12 is a small chloroplast protein that is widespread in various photosynthetic organisms and is often involved in the redox metabolic on/off switch of the Calvin Benson Bassham (CBB) cycle. The gene encoding this protein is conserved in many diatoms, but the protein has been overlooked in these organisms, despite their ecological predominance and their complex and still enigmatic evolutionary background.


Results
Here, we demonstrate that CP12 is expressed in the marine diatom Thalassiosira pseudonana constitutively in dark-treated and in continuous light-treated cells as well as in all growth phases. This CP12 behaves abnormally under gel electrophoresis, is heat resistant and lacks a structural core, all features of the intrinsically disordered protein family, like its homologues in other species. By contrast, unlike other known CP12 proteins that are monomers, this protein is a dimer, as shown by native electrospray ionization-mass spectrometry and small angle X-ray scattering. In addition, small angle X-ray scattering showed that this CP12 is an elongated cylinder with kinks. Circular dichroism spectra indicated that CP12, though it has features of disordered proteins, has a high content of α-helices. Nuclear magnetic resonance spectroscopy showed that these helices are unstable and dynamic within a millisecond timescale. Together with in silico predictions, these results suggest that T. pseudonana CP12 has both coiled-coil and disordered regions.

Conclusions
These findings bring new insights into the large family of intrinsically disordered proteins, increasing the diversity of known CP12 proteins. This raises questions about the role of this protein in addition to the well-established regulation of the CBB cycle.

Background
CP12 (chloroplast protein of 12 kDa) is a small nuclear-encoded protein of about 80 amino acid residues, originally described by Pohlmeyer et al. [1], that occurs in many photosynthetic organisms [2,3]. In higher plants, green and red algae, and cyanobacteria, it is associated with two enzymes, phosphoribulokinase (PRK) and glyceraldehyde 3-phosphate dehydrogenase (GAPDH), from the Calvin Benson Bassham cycle that is responsible for CO2 assimilation [4]. This ternary complex has been well studied and its structure has recently been solved using cryo-electron microscopy in the cyanobacterium Synechococcus elongatus [5] and using X-ray diffraction in the model higher plant Arabidopsis thaliana [6]. CP12 proteins from different organisms have some highly conserved regions such as an AWD_VEEL motif [2] and in most cases have a pair of cysteine residues at the C-terminus and/or a second pair at the N-terminus. Although cysteine residues are order-promoting amino acids, and likely to structure the molecule, CP12 shares some physico-chemical properties with intrinsically disordered proteins (IDPs). In agreement with predictors of intrinsic disorder, CP12 from the green alga Chlamydomonas reinhardtii [7] and later from the angiosperm A. thaliana [8] were shown by circular dichroism (CD) and Nuclear Magnetic Resonance (NMR) to contain little regular secondary structure in solution [9][10][11].
These proteins have been extensively studied in these organisms. In C. reinhardtii, CP12 is a jack-of-all-trades that not only binds to PRK and GAPDH in its oxidized state, thereby downregulating their activity upon complex formation in the dark, but also performs other functions [12]. For example, it can bind metal ions [13] and can act as a specific chaperone-like protein for GAPDH [14]. In the tropical legume Stylosanthes guianensis, higher expression of CP12 increases growth, plant height and photosynthesis rate [15]. Conversely, in the tobacco Nicotiana tabacum and in the mouse-ear cress A. thaliana, antisense suppression of CP12 reduces the rate of photosynthesis and increases the expression of proteins related to oxidative stress [16,17]. Finally, in A. thaliana, one CP12 isoform is mainly expressed in non-photosynthetic tissues (roots and floral tissues) [18]. All these features suggest that CP12 proteins have other functions beyond the dark downregulation of CBB enzymes.
Little is known about CP12 in the diatoms, an ecologically important group of microalgae. Diatoms have a more complex evolutionary background than the other organisms mentioned above in which CP12 has been studied, and the regulation of their CBB enzymes is not fully understood [19]. The PRK-GAPDH-CP12 complex does not seem to be present [20][21][22][23], though some studies indicate a possible presence of CP12 in these organisms [20,24]. For example, a CP12-like protein from Thalassiosira pseudonana has been shown to be expressed under stress conditions such as low CO2 [25] or low nutrients (nitrogen, phosphorus, silicon) [26]. The aim of this manuscript was therefore to characterize the structural properties of this protein from the marine diatom T. pseudonana, using both in silico and a range of biophysical experimental approaches.

Material And Methods
Diurnal expression of T. pseudonana CP12 in vivo
Cells from T. pseudonana (strain CCMP 1335 from https://ncma.bigelow.org/) were grown in F/2 + Si medium (http://www.ccap.ac.uk/) under continuous light (50 µmol.photon.m−2.s−1) in an incubator (Innova 4230, New Brunswick Scientific) at 19 °C and shaken at 100 rpm. Growth of T. pseudonana was monitored using the absorbance at 680 nm. When the cells reached the exponential phase, half of the culture was put in the dark, and half left in the light. After 24 hours, cells were collected by centrifugation for 15 min at 3275 g, 19 °C with a Beckman Allegra X15R centrifuge, and resuspended in 15 mM tris(hydroxymethyl)aminomethane (Tris), 4 mM ethylenediaminetetraacetic acid (EDTA), pH 7.9 with protease inhibitors (Sigma) at 0.5 µg.mL−1. The cells were sonicated (Sonic Ruptor 250, on ice, 4 cycles, 1 min sonication and 1 min rest), then centrifuged at 16 000 g for 20 min at 4 °C and the supernatant collected. Protein amount was measured by the Bradford protein assay using bovine serum albumin as a standard (Bio-Rad). Proteins were loaded onto a 12% sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) gel that was stained with Coomassie Blue or immediately transferred onto a 0.45 µm nitrocellulose membrane (Thermo Fisher Scientific). The antibodies raised against recombinant His-tagged CP12 in rabbits were produced by Eurogentec (https://www.eurogentec.com/en/customantibodies ). The membrane was incubated first with α-CP12 antibody diluted 1:10 000, then with goat anti-rabbit IgG horseradish peroxidase (HRP, Invitrogen) diluted 1:10 000. Finally, the membrane was revealed with luminol-based substrate (Amersham Enhanced Chemiluminescence western blotting kit detection reagent) using an ImageQuant LAS 4000 biomolecular imager (GE Healthcare).

Expression of CP12 during different phases of growth
A preculture of T. pseudonana cells in F/2 + Si medium, under continuous light (50 µmol.photon.m−2.s−1), was first grown at 19 °C, shaken at 100 rpm under high CO2 concentration (20 000 ppm) to increase biomass. After five days, the pellet of a 40 mL aliquot of this pre-culture obtained by centrifugation at 3275 g for 15 min at 4 °C was washed and re-suspended in fresh F/2 + Si medium. This was performed three times. These cells were then inoculated into fresh F/2 + Si medium at an initial absorbance at 680 nm of 0.2 (path length 1 cm) and grown under air-concentration of CO2 (400 ppm). Growth of T. pseudonana was monitored using the absorbance at 680 nm. Every day, a volume of the culture was collected that was normalized to obtain 30 µg of total protein according to the following equation: . A pellet of cells was obtained after centrifugation for one minute at 3275 g and re-suspended into SDS-PAGE loading buffer containing 1 mM dithiothreitol. Cell lysis and protein denaturation were performed at 95 °C for 10 min. During the exponential phase and the beginning of the stationary phase, the expression of CP12 was monitored using western-blot analysis.
Overexpression and purification of CP12 from T. pseudonana
Primers containing the NdeI and BamHI restriction sites were used to amplify and clone the CP12 gene in frame with the N-terminal histidine tag of the pET28a expression vector (Novagen) (forward primer 5′ CATATGGCTGCCATTGAAGCTGCTCT 3′ and reverse primer 5′ GGATCCCTAACGGGAACCAAGGGCC 3′). This plasmid was used to transform Escherichia coli BL21(DE3) pLysS. Freshly transformed bacteria were grown in 2YT medium with 50 µg/mL kanamycin and 34 µg/mL chloramphenicol at 37 °C until the absorbance at 600 nm reached 0.5 to 0.6 (1 cm path length). Cultures were cooled on ice for 30 min and then CP12 expression was induced with 1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG). Cells were cultured at 30 °C overnight in an incubator (Edmund Bühler GmbH, Fisher Bioblock Scientific), then centrifuged at 3275 g at 4 °C (Beckman Allegra X15R centrifuge). The cell pellets were resuspended in 50 mM NaH2PO4/Na2HPO4, 300 mM NaCl, 10 mM imidazole, pH 8.0 (Ni-NTA buffer). Cells were then broken by sonication (Sonic Ruptor 250, 1 min sonication and 1 min on ice, 4 cycles) and centrifuged at 27 000 g for 20 min at 4 °C. The supernatant contained the recombinant histidine-tagged CP12 (His-CP12), which was then purified by nickel ion affinity chromatography on a Ni-NTA agarose column (Qiagen) (1.2 × 8 cm of resin). The column was equilibrated with Ni-NTA buffer. Contaminants were first washed out with 10 mM imidazole until the absorbance at 280 nm reached a minimum, then fractions were gradually eluted with an imidazole gradient (10 to 250 mM imidazole, 2 × 45 mL). Protein elution was followed by absorbance at 280 nm. His-tagged CP12 was eluted with 150 mM imidazole, dialyzed against 10 mM sodium phosphate buffer, pH 7.4, then stored at -20 °C.
Size exclusion chromatography (SEC), electrospray ionization coupled to mass spectrometry (ESI-MS) and circular dichroism (CD) were performed on CP12 after His tag removal with thrombin (T4648, Sigma) (1 u for 100 µg of CP12) at room temperature for 18 h. The sample was concentrated using a 500 µL spin X-UF ultra centrifugal concentrator, Corning, 5 kDa cut-off. After His-tag removal, CP12 was stored at -20 °C.
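The cloning design described above can be checked in a few lines. The following is a minimal sketch, not part of the original protocol: it takes the two primer sequences as printed in the text and tests that each carries its restriction site at the 5′ end (the standard recognition sequences CATATG for NdeI and GGATCC for BamHI), that the NdeI site supplies the ATG start codon, and that the reverse primer encodes a stop codon immediately upstream of the BamHI site. The function names are illustrative only.

```python
# Primer sequences copied from the cloning step above (written 5' -> 3').
FORWARD = "CATATGGCTGCCATTGAAGCTGCTCT"
REVERSE = "GGATCCCTAACGGGAACCAAGGGCC"

# Standard recognition sequences of the two restriction enzymes.
NDEI = "CATATG"
BAMHI = "GGATCC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def check_primers(forward, reverse):
    """Run simple sanity checks on the primer pair (illustrative helper)."""
    # The reverse primer anneals to the coding strand, so its reverse
    # complement reads like the 3' end of the cloned coding sequence.
    coding_end = reverse_complement(reverse)
    return {
        "forward_starts_with_ndei": forward.startswith(NDEI),
        "ndei_supplies_atg": forward[3:6] == "ATG",
        "reverse_starts_with_bamhi": reverse.startswith(BAMHI),
        "stop_codon_before_bamhi": coding_end.endswith("TAG" + BAMHI),
    }


if __name__ == "__main__":
    for name, ok in check_primers(FORWARD, REVERSE).items():
        print(f"{name}: {ok}")
```

Because the ATG sits inside the NdeI site, the insert is placed in frame with the upstream histidine tag of the vector, as stated in the text; the sketch only checks the primers themselves, not the vector sequence.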

Native electrospray ionization-mass spectrometry (ESI-MS)
Prior to native ESI-MS analysis, CP12 without the histidine tag was dialyzed against 200 mM ammonium acetate buffer (pH 8.0) using 5 kDa cut-off concentrator columns (Spin-X, Corning). Experiments at 5 µM of CP12 were carried out on an electrospray Q-ToF mass spectrometer (Synapt G1 HDMS, Waters) using a NanoLockSpray ionisation source with a borosilicate emitter (NanoES spray capillaries, Thermo Scientific). Optimized instrument parameters were as follows: source pressure 5.3 mbar, source temperature 20 °C, capillary voltage 1.8 kV, sampling cone voltage 180 V, extractor cone voltage 4 V, trap collision energy 30 V and transfer collision energy 20 V. The mass spectrometer was calibrated in positive mode from 1000 to 5000 m/z with CsI (1 mg/mL) just prior to acquisition.
Size exclusion chromatography
500 µL of CP12 after histidine tag removal, at 0.5 mM, were loaded on a Hiload Superdex 200 (prep grade 26/60) equilibrated with 150 mM NaCl, 50 mM sodium phosphate buffer at pH 7.5. The column was calibrated with seven globular proteins of different molecular mass: thyroglobulin, ferritin, alcohol dehydrogenase, conalbumin, ovalbumin and carbonic anhydrase. Throughout elution, the absorbance at 280 nm was monitored to determine the presence of proteins. The fractions were collected, concentrated (spin X-UF ultra centrifugal concentrator, Corning, 5 kDa cut-off) and stored at -20 °C.
Nuclear magnetic resonance (NMR)
15N-labelled histidine-tagged CP12 was produced using the enhanced M9 medium (protocol of the European Molecular Biology Laboratory, https://www.embl.de/pepcore/pepcore_services/protein_expression/ecoli/n15/index.html) and purified as described above. The final sample was buffer exchanged in 50 mM sodium phosphate pH 6.5, 50 mM NaCl, 10% D2O with traces of sodium trimethylsilylpropanesulfonate (DSS), at a final protein concentration of 250 µM.
The data were recorded at 4 ºC and 15 ºC. Fast 1H-15N heteronuclear single quantum correlation (fHSQC [38]) spectra were recorded with a 1H acquisition time of 243 ms, a 15N acquisition time of 42 ms and with 24 scans on a 600 MHz NMR spectrometer equipped with a cryogenic probe (Bruker). Translational diffusion was measured using a standard bi-polar stimulated echo experiment [39], with a diffusion delay (Δ) of 200 ms. Ten experiments were recorded in which the encoding and decoding pair of gradients were produced with squared 1.4 ms long gradients (δ) with strength (G) ranging from 2 to 98% of the maximum gradient strength (5.1 G.mm−1). The data were processed using nmrPipe [40] and plotted using Sparky [41]. The diffusion coefficient (D) was calculated using Octave [42] from the integral of proton signals of the methyl side chains of the protein from 1.2 to 0.7 ppm. The linear dependency of the logarithm of the integral as a function of the gradient strength was used to determine D, where I G is the integral as a function of the gradient strength, I 0 is the integral in the absence of gradient, γ H is the proton gyromagnetic ratio, and Δ, δ and G are defined above. The hydrodynamic radius (Rh) associated with the diffusion coefficient was determined using the Stokes-Einstein equation, $R_h = k_B T / (6 \pi \eta(T) D)$, where k B is the Boltzmann constant, T the temperature in Kelvin and η(T) the viscosity.
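The diffusion analysis described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' Octave script: the attenuation is assumed to follow the classical Stejskal-Tanner form ln(I_G/I_0) = -D (γ_H δ G)² (Δ - δ/3), which uses only the symbols defined in the text but may differ from the exact expression used for the bi-polar sequence; the maximum gradient is read as 5.1 G.mm−1 (0.51 T.m−1); and the viscosity of pure water at 4 °C (about 1.567 mPa.s) stands in for the buffer viscosity. The function names are hypothetical.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
GAMMA_H = 2.6752218744e8   # proton gyromagnetic ratio, rad/s/T


def attenuation_factor(delta, big_delta, gamma=GAMMA_H):
    """(gamma*delta)^2 * (Delta - delta/3): assumed Stejskal-Tanner factor."""
    return (gamma * delta) ** 2 * (big_delta - delta / 3.0)


def fit_diffusion(gradients, integrals, delta, big_delta):
    """Estimate D (m^2/s) from the least-squares slope of ln(I) versus G^2."""
    x = [g * g for g in gradients]
    y = [math.log(i) for i in integrals]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return -(sxy / sxx) / attenuation_factor(delta, big_delta)


def hydrodynamic_radius(diffusion, temperature, viscosity):
    """Stokes-Einstein relation: Rh = kB*T / (6*pi*eta*D), in metres."""
    return K_B * temperature / (6.0 * math.pi * viscosity * diffusion)


if __name__ == "__main__":
    delta, big_delta = 1.4e-3, 0.200   # s, values quoted in the text
    g_max = 0.51                       # T/m, assumed reading of 5.1 G/mm
    gradients = [g_max * (0.02 + 0.96 * k / 9) for k in range(10)]

    # Synthetic decay generated with the reported D, to exercise the fit.
    d_true = 3.8e-11
    factor = attenuation_factor(delta, big_delta)
    integrals = [math.exp(-d_true * factor * g * g) for g in gradients]

    d_fit = fit_diffusion(gradients, integrals, delta, big_delta)
    rh = hydrodynamic_radius(d_fit, 277.15, 1.567e-3)
    print(f"D = {d_fit:.3e} m^2/s, Rh = {rh * 1e9:.2f} nm")
```

With the reported diffusion coefficient and the assumed water viscosity at 4 °C, the Stokes-Einstein step gives a radius close to the 3.4 nm reported in the Results; the synthetic decay only tests that the fit recovers the value it was generated with.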
Small angle X-ray scattering (SAXS)
SAXS experiments were performed on the SWING beamline at the SOLEIL synchrotron using the online HPLC size exclusion chromatography facilities [43]. The sample-to-detector (CCD Aviex) distance was set at 2 m, leading to scattering vectors (q = 4π sinθ/λ, where 2θ is the scattering angle and λ the wavelength, equal to 1.033 Å) ranging from 0.01 to 0.46 Å−1. 50 µL of His-tagged CP12 (11 mg/mL) were injected into a pre-equilibrated size exclusion chromatography column (Agilent Bio-SEC-3 300 Å) upstream of the measurement capillary at a temperature of 15 °C. Frames of 990 ms with a dead-time of 10 ms were recorded throughout the elution, with 100 frames recorded during the very first minutes of the elution to measure the buffer background. The protein concentration was monitored via the absorbance at 280 nm with an in situ spectrophotometer. The experiment was performed in 30 mM Tris, 50 mM NaCl, 2 mM EDTA, 1 mM Tris(2-carboxyethyl)phosphine (TCEP), pH 7.5.
Data reduction to absolute units and solvent subtraction were performed using FOXTROT, a dedicated in-house application. The frames recorded during the elution peak were carefully compared with each other, and data corresponding to identical scattering profiles and radius of gyration (Rg) were averaged to increase the signal-to-noise ratio. Data analysis was performed using the ATSAS suite of software [44]. The Rg and forward scattering intensity I(0) were obtained via PRIMUS using the Guinier approximation up to q.Rg < 1.0, and the distance distribution function P(r) was obtained via GNOM. The molecular mass of CP12 was determined using the forward scattering intensity I(0) of the frame corresponding to the top of the peak, at the highest protein concentration, as described in [45]. Ab initio models of 3D envelopes corresponding to the scattering curve were constructed using DAMMIF (10 runs) and GASBOR (5 runs) with P1 and P2 symmetry [44].
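The Guinier step mentioned above rests on the approximation ln I(q) ≈ ln I(0) - (q.Rg)²/3, valid at small q. The sketch below is a generic illustration of that fit and is not the PRIMUS implementation: it fits a straight line to ln I versus q² and re-selects the points satisfying q.Rg < 1.0 until the estimate settles. The input curve is synthetic, generated from the Rg reported in the Results purely to exercise the code; the function name and the starting guess are assumptions.

```python
import math


def guinier_fit(q, intensity, rg_guess, qrg_max=1.0, iterations=5):
    """Return (Rg, I0) from a linear fit of ln I(q) against q^2.

    Only points with q*Rg < qrg_max are used; the window is updated
    with the current Rg estimate at each iteration.
    """
    rg = rg_guess
    i0 = None
    for _ in range(iterations):
        pts = [(qi * qi, math.log(ii))
               for qi, ii in zip(q, intensity) if qi * rg < qrg_max]
        if len(pts) < 2:
            raise ValueError("not enough points in the Guinier range")
        n = len(pts)
        mean_x = sum(p[0] for p in pts) / n
        mean_y = sum(p[1] for p in pts) / n
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
        sxx = sum((x - mean_x) ** 2 for x, _ in pts)
        slope = sxy / sxx
        if slope >= 0:
            raise ValueError("non-negative slope: no Guinier region found")
        rg = math.sqrt(-3.0 * slope)          # slope = -Rg^2 / 3
        i0 = math.exp(mean_y - slope * mean_x)  # intercept at q = 0
    return rg, i0


if __name__ == "__main__":
    # Synthetic, noise-free Guinier curve (q in 1/Angstrom, Rg in Angstrom).
    rg_true, i0_true = 38.2, 2.0
    q = [0.010 + 0.001 * k for k in range(40)]
    intensity = [i0_true * math.exp(-(qi * rg_true) ** 2 / 3.0) for qi in q]
    rg, i0 = guinier_fit(q, intensity, rg_guess=30.0)
    print(f"Rg = {rg:.1f} A, I(0) = {i0:.2f}")
```

On real data, an upturn of ln I at the lowest q values (as reported for traces of aggregates) would bias this simple fit, which is why the lowest-q points are usually inspected or excluded before fitting.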

3D-modelling
3-D modelling of the coiled coil domain has been performed using Swiss-model (https://swissmodel.expasy.org/ [46]) on a sequence including 5 residues before and 6 residues after the

Expression of a CP12 protein in T. pseudonana
Chen et al. [26] reported a hypothetical protein of uncharacterized function that was overexpressed when T. pseudonana cells were limited by N, P or Si (identification: XP_002286772). This protein is identical to our previously identified CP12 protein that is expressed under low CO2 conditions [25]. Here we showed that this CP12 protein was present in dark-treated and light-treated T. pseudonana cells (Fig. 1). Similarly, the expression level was stable during all phases of growth (Fig. 2).
In silico analysis of CP12 sequences
The sequence of this CP12 protein was aligned with other representative CP12 sequences from the angiosperms A. thaliana, Pisum sativum and Spinacia oleracea, the green alga C. reinhardtii, two red algae Cyanidioschyzon merolae and Galdieria sulphuraria, and the cyanobacterium S. elongatus (PCC 7942) (Fig. 3). Besides the highly conserved AWD_VEEL motif, one pair of cysteine residues at the C-terminus, another hallmark of CP12s, was also present. In contrast, the highly conserved proline positioned at the center of the 8-residue linker between these two cysteine residues is absent in CP12 from T. pseudonana. In the nuclear genome of other diatoms, Phaeodactylum tricornutum, Thalassiosira oceanica, Fragilariopsis cylindrus and Pseudo-nitzschia multistriata, a gene encoding a protein with high similarity to this CP12 protein was also found (Fig. 3). Furthermore, all these newly identified CP12 proteins from diatoms were predicted to be targeted to the chloroplast, in good agreement with the known localization of CP12 in other well-studied organisms.
In silico characterization of the propensity of disorder
Since CP12 proteins from other organisms are IDPs [9,11,47], we checked if this was the case for CP12 from T. pseudonana. The amino acid composition of this CP12 and that from C. reinhardtii, a well-known IDP, were compared to globular proteins. In the sequences of CP12 from C. reinhardtii and T. pseudonana, alanine and charged residues that promote disorder, such as aspartate, glutamate and lysine residues (D, E and K), are more abundant than in globular structured proteins (Fig. 4). In contrast, order-promoting residues such as tryptophan, phenylalanine and tyrosine residues (W, F and Y) are less abundant than in structured proteins. In CP12 from T. pseudonana, cysteine residues are less abundant (two cysteine residues) than in the green algal CP12 (four cysteine residues). Titration with Ellman's reagent, DTNB, of two thiol groups in our recombinant CP12 from T. pseudonana showed that its two cysteine residues were not involved in a disulfide bridge, even under atmospheric aerobic conditions. It has an isoelectric point of 4.5, like many proteins, and, interestingly, a very high negative net charge of −14.6, strengthening the hypothesis that this protein may be an IDP.
The disorder propensity of CP12 from T. pseudonana was predicted using several algorithms. PONDR predicted a long disordered segment between residues 81-146, whereas the other predictors were more stringent (Fig. 5), in particular IUPred2A which predicts only a small number of disordered regions. Taking all predictions together, four regions of disorder were consensually predicted which encompass residues 30-42, 77-93, 108-117 and 129-139 (shaded residues in Fig. 5). Apart from the disordered regions, the algorithm s2D predicts the other regions to have a high propensity to form helices.
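One way to combine several disorder predictors into consensus regions is a simple per-residue vote. The text does not state the rule actually used to reach the consensus, so the sketch below is only an assumption: a residue is kept when at least `min_votes` predictors call it disordered, and consecutive kept residues are merged into regions. The predictor names, intervals and sequence length in the example are toy values, not the outputs shown in Fig. 5.

```python
def consensus_regions(predictions, length, min_votes):
    """Merge per-predictor disordered intervals into consensus regions.

    predictions: dict mapping a predictor name to a list of (start, end)
                 intervals, 1-based and inclusive.
    length:      number of residues in the sequence.
    min_votes:   minimum number of predictors that must agree on a residue.
    """
    votes = [0] * (length + 1)  # index 0 unused, residues are 1-based
    for regions in predictions.values():
        covered = set()
        for start, end in regions:
            covered.update(range(start, end + 1))
        for residue in covered:
            if 1 <= residue <= length:
                votes[residue] += 1  # one vote per predictor per residue

    merged, start = [], None
    for residue in range(1, length + 1):
        if votes[residue] >= min_votes:
            if start is None:
                start = residue
        elif start is not None:
            merged.append((start, residue - 1))
            start = None
    if start is not None:
        merged.append((start, length))
    return merged


if __name__ == "__main__":
    # Toy input: hypothetical predictors and intervals for illustration.
    toy = {
        "predictor_a": [(81, 146)],
        "predictor_b": [(30, 42), (77, 93)],
        "predictor_c": [(28, 45), (80, 90), (129, 139)],
    }
    print(consensus_regions(toy, length=160, min_votes=2))
```

A stricter threshold shrinks the consensus toward the most stringent predictor, while a threshold of one reproduces the union of all predictions, so the choice of `min_votes` should be reported alongside the regions.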

Experimental characterization of the propensity of disorder
Abnormal migration is often observed in SDS-PAGE for IDPs, with an apparent molecular mass higher than the theoretical one. The theoretical molecular mass of a monomer of CP12 from T. pseudonana is 17.6 kDa but it migrated as a protein of about 25 kDa in SDS-PAGE (Fig. 6). This abnormal migration resembles that of CP12 from C. reinhardtii where the molecular mass of the monomer with a histidine tag is ~ 11 kDa but migrates at about 20 kDa. Another key characteristic of IDPs is their thermal resistance; CP12 from T. pseudonana, like that from C. reinhardtii, was found in the supernatant after heat treatment at 95 °C. This indicates that even at high temperature this protein does not precipitate nor aggregate (Fig. 6).

Oligomeric state of CP12
The oligomeric state of CP12 was investigated using native ESI-MS. Monomer and dimer oligomeric states were observed (Fig. 7A). Experimental deconvoluted values of 17 642.1 (± 0.5) Da and 35 285.3 (± 0.7) Da fit well with theoretical values for the monomer and dimer species, 17 641.8 Da and 35 283.6 Da respectively. In parallel, size exclusion chromatography showed a single, homogeneous elution peak, which corresponds to a unique oligomeric state in solution (Fig. 7B). It is likely that during the ionization of the molecular complex, the dimer dissociates partially, indicating that it is not a covalent dimer. The discrepancy between the molecular mass determined by ESI-MS for the dimeric state and the size exclusion chromatography elution volume of CP12 (corresponding to an apparent molecular mass of 93 (± 4) kDa) is consistent with CP12 being an extended dimer.
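The agreement between measured and theoretical masses quoted above can be made explicit with a short calculation. This is a minimal sketch using only the numbers given in the text: it expresses each deviation in parts per million, confirms that the theoretical dimer mass is twice the monomer mass (as expected for a non-covalent dimer with no mass change), and compares the apparent mass from size exclusion chromatography with the dimer mass. The function names are illustrative.

```python
def ppm_error(measured, theoretical):
    """Relative deviation of a measured mass, in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def oligomer_mass(monomer_mass, n_subunits):
    """Mass of a non-covalent oligomer with no mass change on assembly."""
    return n_subunits * monomer_mass


if __name__ == "__main__":
    # Values in Da, as quoted in the text.
    monomer_theory, monomer_measured = 17641.8, 17642.1
    dimer_theory, dimer_measured = 35283.6, 35285.3
    sec_apparent = 93000.0  # apparent mass from size exclusion chromatography

    print(f"monomer error: {ppm_error(monomer_measured, monomer_theory):.1f} ppm")
    print(f"dimer error:   {ppm_error(dimer_measured, dimer_theory):.1f} ppm")
    print(f"2 x monomer:   {oligomer_mass(monomer_theory, 2):.1f} Da")
    print(f"SEC apparent mass / dimer mass: {sec_apparent / dimer_theory:.2f}")
```

The ratio in the last line, well above one, is the quantitative form of the discrepancy discussed in the text: the column is calibrated with globular proteins, so an extended particle elutes as if it were much heavier.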

CP12 is a dynamic extended dimer
The hydrodynamic size of CP12 in solution was confirmed by measuring the translational diffusion coefficient using DOSY-NMR (Fig. 8A). The translational diffusion coefficient was 3.8 (± 0.1) × 10−11 m2.s−1 at 4 ºC, which corresponds to a hydrodynamic radius (Rh) of 3.4 (± 1) nm. For a folded dimer of 35.3 kDa, the theoretical Rh is 2.75 nm, while for an unfolded dimer it is 5.94 nm [48]. Therefore, the experimental Rh of 3.4 nm represents an intermediate between that expected for a fully disordered and a fully ordered dimer and is also compatible with an extended dimer. The 1H-15N fast-HSQC spectrum of CP12 presents some characteristic features of that of a disordered protein, with the absence of 1H chemical shift dispersion (Fig. 8B). 34 resonances present narrow linewidths compatible with a purely disordered region. However, in contrast to a true random-coil protein such as CP12 from C. reinhardtii (Fig. 8C, [9]), the linewidths of the observed resonances varied from narrow (< 20 Hz) to broad (> 60 Hz), indicating the presence of an intermediate chemical exchange on the NMR timescale. This result showed that reduced CP12 is highly dynamic on a range of timescales from ps to ms and does not possess a stable secondary structure.
We then performed SAXS experiments on CP12, a technique suited to assessing the overall structure and oligomeric state of disordered or partly disordered proteins (Fig. 9) [45]. The Guinier plot of the average spectrum was linear with a slight increase at very small q (q < 0.14 Å−1), indicating the presence of very small traces of aggregates that were not detected by the absorbance at 280 nm (Fig. 9A). The molecular mass inferred from the forward scattering intensity is 41.5 (± 3.0) kDa, confirming that CP12 is dimeric, as the theoretical molecular mass of dimeric histidine-tagged CP12 is 39 kDa. The inferred radius of gyration of CP12 was 38.2 (± 0.4) Å. The maximum dimension of the protein was 135 (± 5) Å, indicating that the protein is elongated. The normalized Kratky plot exhibited a large peak at (q.Rg) = 3.4 and (q.Rg)2.I(q)/I(0) = 1.7, followed by a decrease and then an increase of the curve, which is typical of a protein containing both globular folded domain(s) and disordered region(s) (Fig. 9A) [45]. Determination of the envelope of CP12 gave similar results using DAMMIF or GASBOR, applying a P1 or P2 symmetry, with excellent fits to the data (χ2 = 1.61-1.66 with DAMMIF, and 1.39-2.39 with GASBOR, Fig. 9B). The envelope is very elongated and forms a cylinder of ~ 20-25 Å diameter, with several kinks which may reflect the limits between different domains (Fig. 9C). This is also consistent with the presence of several disordered regions as previously mentioned (Fig. 5).
Secondary structure of CP12
Circular dichroism (CD) was used to determine the secondary structure of CP12. The CD spectrum had two minima at 208 and 222 nm (Fig. 10A), implying a high content of α-helices (50%), plus 10% of β-sheets and 14% of random coil. This result is consistent with the prediction of both disordered and helical regions using the s2D predictor. To investigate further the conformation of CP12, the solvent 2,2,2-trifluoroethanol (TFE), known to stabilize helical structures, was added. The CD spectra with increasing TFE concentrations showed an increase of the content of α-helices up to 79%, as revealed by lower values of ellipticity at 208 and 222 nm. We observed an isosbestic point at 201-202 nm, indicating the existence of two conformers in equilibrium (Fig. 10A). The presence of unstable helical structures in a dimeric and elongated protein is reminiscent of a coiled coil propensity. We thus used a predictor for coiled coil regions and found that the region encompassing residues 46-82 (Fig. 10B) has a very high propensity for coiled coil arrangement. Furthermore, 3-D modelling of this domain using Swiss-Model yielded a dimeric coiled coil structure that fits well inside the SAXS envelope (Fig. 9C). The propensity to form coiled coil structures might be a characteristic shared with other CP12 proteins, and at least CP12 from C. reinhardtii has a weak propensity to form coiled coil in the region encompassing the AWD_VEEL motif (Fig. 10C). These in silico predictions are consistent with our experimental data; in particular, the SAXS results are consistent with a putative central coiled coil domain flanked by other elongated domains likely to be disordered and/or possibly containing short structural elements.

Discussion
CP12 proteins from different organisms are highly diverse at the protein sequence level [2] but have some characteristic features such as the AWD_VEEL motif and often a pair of cysteine residues at the C-terminus, separated by eight residues encompassing a proline residue. The CP12 proteins from C. reinhardtii, A. thaliana and S. elongatus have been extensively studied and shown to belong to the IDP family. IDPs, or in other words ductile, dancing, malleable or flexible proteins [49], are common in various proteomes [50][51][52][53][54][55] and occupy a unique structural and functional niche in which function is directly linked to structural disorder [47][56][57][58]. CP12 from the diatom T. pseudonana is constitutively expressed under dark and light conditions, as well as throughout growth, unlike CP12 from A. thaliana, which is co-expressed with PRK and GAPDH in the light [59]. This indicates that CP12 may have other functions in diatoms compared to Viridiplantae. Indeed, other studies have shown that it is upregulated under stress-related conditions in T. pseudonana (nutrient deprivation [26], low CO2 [25]).
Like CP12 from other organisms [7,11,50], CP12 from T. pseudonana is characterized by several features of IDPs, with a lower proportion of order-promoting residues and a higher proportion of charged residues than structured proteins. This high net charge and low overall hydrophobicity are in line with its abnormal migration on SDS-PAGE, its heat resistance, the absence of a rigid globular core as observed using NMR and its high flexibility on the NMR timescale. Another IDP characteristic is the ability to take part in the formation of macromolecular complexes, and this is the case for CP12 proteins from other organisms that bind to GAPDH and PRK [7,8,16][60][61][62]. In contrast, the PRK-GAPDH-CP12 ternary complex has not been found in diatoms. This was attributed to the absence of cysteine residues on PRK in diatoms [22] but may also be the consequence of specific features of this CP12. Moreover, in a freshwater diatom, Asterionella formosa, GAPDH interacts with the ferredoxin-NADP reductase (FNR), from the primary phase of photosynthesis, and a small protein identified as a CP12 [63]. Diatom chloroplasts lack the oxidative pentose phosphate pathway, the main NADPH generating source in the dark. In the ternary complex GAPDH-CP12-FNR, GAPDH, a main NADPH consumer, is inhibited [20], thereby releasing the pressure on NADPH availability for other metabolic pathways.
While CP12 from T. pseudonana is predicted to contain at least one long disordered region, the proportion of α-helices in CP12 from T. pseudonana (50%) is much higher than in CP12 from C. reinhardtii [10] (30%) and in canonical IDPs. In addition, SAXS and native ESI-MS showed that CP12 from T. pseudonana is dimeric, unlike CP12 from other species, which is monomeric in its free isolated state [9][10][11]. Dimeric oligomerization of very dynamic proteins is mediated by numerous but transient inter-molecular interactions, and this is a feature of peculiar IDPs that form coiled coil arrangements [64][65][66], but not exclusively [67,68]. Indeed, CP12 from T. pseudonana was predicted to form α-helical coiled coils with a high probability, and this is consistent with the extended hydrodynamic radius experimentally observed by gel filtration and DOSY-NMR, and with the global shape derived from the SAXS data. The coiled coil is an omnipresent protein fold, accounting for about 3 to 10% of all protein-coding regions across all genomes [69][70][71]. Proteins having coiled coils can be regarded as a division of IDPs, contributing to an ever-growing list of functional contexts [64][65][66], and coiled coils are one of the most ubiquitous protein-protein interaction motifs [70]. Although CP12 from C. reinhardtii is not predicted to form coiled coil structures with the same high probability as T. pseudonana CP12, they both share a propensity to form unstable helices [10]. Such highly dynamic and adaptive biophysical properties are a common feature of moonlighting proteins [12].

Conclusion
The structural properties of CP12 from T. pseudonana, with a putative dimeric coiled coil domain and disordered regions, suggest that it is not an alien with regard to other CP12s and might therefore have one-to-many functions. As the gene encoding this protein has also been found in other diatoms, it could be another facet of the enigmatic regulation of diatom metabolism [19,23,72]. These findings extend the context for dynamic and coiled coil proteins related to their functions in photosynthesis regulation and stress in diatoms.