Orchestration of signaling by structural disorder in class 1 cytokine receptors
Cell Communication and Signaling volume 18, Article number: 132 (2020)
Class 1 cytokine receptors (C1CRs) are single-pass transmembrane proteins responsible for transmitting signals between the outside and the inside of cells. Remarkably, they orchestrate key biological processes such as proliferation, differentiation, immunity and growth through long disordered intracellular domains (ICDs), but without having intrinsic kinase activity. Despite these key roles, their characteristics remain rudimentarily understood.
The current paper asks the question of why disorder has evolved to govern signaling of C1CRs by reviewing the literature in combination with new sequence and biophysical analyses of chain properties across the family.
We uncover that the C1CR-ICDs are fully disordered and brimming with short linear motifs (SLiMs). Many of these SLiMs overlap, jointly signifying a complex regulation of interactions, including network rewiring by isoforms. The C1CR-ICDs have unique properties that distinguish them from most IDPs, and we put forward the view that the C1CR-ICDs are far from simple strings with constitutively bound kinases. Rather, they carry both organizational and operational features left uncovered within their disorder, including mechanisms and complexities of regulatory functions.
Critically, the understanding of the fascinating ability of these long, completely disordered chains to orchestrate complex cellular signaling pathways is still in its infancy, and we urge a perceptional shift away from the current simplistic view towards uncovering their full functionalities and potential.
The sequencing of the human genome and key observations from earlier research [1, 2], spurred the recognition of proteins and protein regions functioning without having three-dimensional folds. These intrinsically disordered proteins (IDPs) and regions (IDRs, collectively here referred to as IDPs), constitute around 30–40% of the human proteome  and perform key cellular and highly regulated processes such as transcription, translation and signaling [4,5,6,7]. IDPs show distinct sequence characteristics with higher frequencies of Pro, Glu, Ser, Gln, Lys, Ala, and Gly, and lower frequencies of Val, Leu, Ile, Trp and Cys. Hence, their sequences have particular properties of low hydrophobicity and high charge , resulting in their intrinsic inability to fold into a single, well-defined structure. Instead, IDPs take on ensembles of almost isoenergetic states, although these are far from random. For example, they may harbor lowly populated secondary structures that are relevant to binding  and to tuning the disordered ensemble , and alteration in the populations of these elements by e.g. mutations may lead to promotion of pathological states . Interactions by IDPs are often mediated by short linear motifs (SLiMs), which are 2–12 residue sequence stretches that are typically recognized by patterns of conserved residues within an otherwise sparsely conserved sequence stretch . SLiMs may overlap, and due to the structural plasticity of disordered regions, transiently exposed motifs and binding interfaces thus provide a trait of multispecificity to an otherwise simple binding site. Furthermore, flanking residues outside a SLiM and post translational modifications (PTMs) can tune affinity and add regulatory properties assisting in the spatiotemporal orchestration of multiple binding events [13, 14]. Indeed, PTMs are frequent in IDPs, in particular phosphorylations  and ubiquitylation , impacting functionalities and regulatory potential in several different ways. 
Thus, PTMs allow IDPs to function in rheostatic regulation, that is, in graded quantitative responses, and may also drive the formation or disruption of membrane-less organelles through liquid-liquid phase separation (LLPS) [18, 19]. Finally, the disordered nature of IDPs allows them to exploit diverse binding mechanisms: they can fold upon binding [20,21,22], but can also form complexes in which structural disorder is maintained to different degrees. Here, the most extreme case is the formation of a completely disordered complex of functional relevance and extreme affinity.
The class 1 cytokine receptors
Structural disorder also exists in membrane proteins. Besides being located in longer, disordered loops connecting transmembrane helices, such as in the sodium proton exchangers and the β1-adrenergic receptor, disorder prevails preferentially on the intracellular side, with membrane proteins having disordered N- and C-terminal tails of various lengths. In fact, ~ 10% of the human membranome have disordered intracellular domains of > 100 residues, classifying them as long disordered regions, and ~ 40% have disordered domains of > 30 residues. For the subgroup of single-pass membrane proteins, an analysis of > 350 human sequences found disorder to be concentrated in the cytoplasmic domains, confirmed in very early work on gliotactin, a single-pass transmembrane receptor involved in cell adhesion. An important family of single-pass transmembrane proteins with long disordered tails is the class 1 cytokine receptors (C1CRs) [32,33,34]. This family comprises 40 members, which have been divided into five different groups based on their structural properties. They all share a tripartite structure characterized by a folded, extracellular domain (ECD) of various sizes and complexities, a single transmembrane helix (TMD), and an intracellular domain (ICD), also of varying length. Recently, the proportions of the different domains were exemplified by a three-dimensional structural model of the prolactin receptor (PRLR), the first full-length structure of any cytokine receptor (Fig. 1a). C1CRs are characterized by the presence of two conserved cysteine bridges in the membrane distal fibronectin type III domain (D1) of the ECD, a WSxWS motif in the membrane proximal domain (D2) of the ECD, and two conserved sequence motifs in their ICD, Box1 and Box2 [32, 36]. The modular structures of the ECDs of the 40 receptors are well known and described broadly (see e.g. [32, 33]). Of the 40 receptors, 29 have an ICD (Fig. 1c), which all lack globular domains and intrinsic kinase activity.
Signaling therefore critically depends on associated kinases such as Janus kinase 1–3 (JAK1, JAK2, JAK3) and tyrosine kinase 2 (TYK2), Src kinases and mitogen activated protein kinases (MAPKs). Box1 is a proline-rich SLiM, onto which JAKs are proposed to be constitutively bound [39,40,41]. Box2 is a less conserved region consisting of a sequence of hydrophobic residues, followed by negatively and then positively charged residues [42, 43]. The function of Box2 is unclear, but it is suggested to be important for efficient binding and activation of JAK1/2/3 and TYK2, possibly in cooperation with the region between Box1 and Box2. Recently, structures of complexes between JAK1/2 and TYK2 and a fragment of a cytokine receptor ICD were solved, namely of the erythropoietin receptor (EPOR) (Fig. 1b), and of two class II cytokine receptor ICDs from the interferon-λ receptor 1 (IFNLR1) and interferon-α receptor chain 1 (IFNAR-1), respectively. These complexes have revealed a common mode of interaction, where Box1 makes hydrophobic contacts to the FERM-domain of JAK1/2;TYK2 and a Glu of Box2 intercalates into the phospho-tyrosine binding pocket of the non-canonical SH2 domain of JAK1/2;TYK2 (Fig. 1b). Src kinases are also suggested to bind to this region [47,48,49,50], most likely via an SH3-SH2 interaction.
C1CRs act in homo- and heteromeric dimers and oligomers, in some of which they share common receptor chains (IL-6Rβ (gp130), βc (IL-3Rβ) and γc (IL-2Rγ)). By binding the cytokine on the extracellular side, multiple common and receptor specific signaling events are initiated intracellularly with the main common pathways being activation of JAK/STAT, Src and MAPK signaling (for reviews on signaling, see e.g. [37, 38]). The receptors of the homodimeric group 1 (sometimes referred to as Type 1), constituting the EPOR, the growth hormone receptor (GHR), the thrombopoietin receptor (TPOR) and the PRLR, are considered to be structurally the simplest. This group has served as paradigmatic models of the family, and studies on the GHR have suggested that hormone binding leads to conformational changes in the receptor, including realignment and separation of the lower transmembrane domains, ultimately resulting in trans-phosphorylation of JAK2 and phosphorylation of the ICDs to initiate signaling [51, 52].
Isoforms of C1CRs
Alternative splicing has been suggested to fine-tune the functions of IDPs [53, 54] and in accordance with this, functionally relevant alternative splicing has been shown to occur primarily in IDPs [53,54,55,56]. Furthermore, a large-scale computational analysis of alternative exons in human genes showed that alternatively spliced IDPs were highly enriched in interaction sites and regions modified by PTMs . This has been suggested to provide a mechanism to rewire interaction networks by including or eliminating specific interaction motifs or binding sites . Alternative splicing is further suggested to be important for maintaining tissue identity, as tissue-specific exons were shown to contain disordered regions enriched in PTMs and binding motifs that were central parts of tissue-specific interaction networks . Many of the C1CRs also exist in different isoforms, but there are currently no reports available on their properties or numbers, or how their characteristics differ. For the PRLR, there are to date nine isoforms identified in humans, with four differing in their ECD [57,58,59,60], out of which one is an ECD soluble isoform . The remaining five isoforms result from alternative mRNA splicing, and differ solely in length and amino acid composition of the ICD. These ICD isoforms constitute the predominant long form (LF, 622 residues), the intermediate form (IF, 349 residues), the short form 1a (SF1a, 376 residues), the short form 1b (SF1b, 288 residues) and the short form 1c (SF-1c, 309 residues), Fig. 1c. Compared to the LF, the IF, SF1a, SF1b and SF1c contain 13, 39, 3 and 24 unique C-terminal residues, respectively. The LF, IF, and SF1a all contain Box1 and Box2 motifs, while SF1b only contains Box1 [62,63,64]. Since activation of the JAK2/STAT5 pathway requires C-terminal tyrosine phosphorylation in the PRLR-ICD , this pathway is only activated by the LF due to lack of pYxxQ docking site motifs in the others [62, 64, 66, 67]. 
In contrast, the MAPK pathway is activated by the LF, SF1a, and SF1b [66, 68, 69], while the PI3K pathway is activated by both the LF [70, 71] as well as the short isoform in mice . Heterodimerization of the LF with either of the shorter isoforms has been shown to inhibit activation of the JAK2/STAT5 pathway [62, 63]. The shortest isoform, PRLR-SF1b, has been shown to negatively regulate PRLR signaling . Hence, isoform ratios need to be delicately balanced, and an increased ratio of short- to long forms has implications in breast-  and prostate  cancers. Lastly, since the different PRLR isoforms are expressed to different extents in different cell types and under different conditions [62,63,64], alternative splicing adds layers of regulation to PRLR signaling.
Structural biology of C1CRs
Many C1CRs have been subjected to structural investigations, but remarkably, > 95% of the available structural information comes from the ECDs alone or in complexes with cytokine ligands. Furthermore, most ECD structures have been solved by X-ray crystallography, which does not provide information on dynamic parts. The understanding of the structure and function of single-pass TMDs is generally lagging behind. For the C1CRs, structures of the TMDs have been solved almost exclusively on group 1 members, i.e. PRLR, EPOR [75, 76] and GHR, using detergent solubilized peptides and nuclear magnetic resonance (NMR) spectroscopy in both monomeric and dimeric states. However, monomeric structures have also been solved for the common receptor βc in bicelle membrane mimetics. Still, no structural information is available for almost one fourth of the receptors, including the TPOR, interleukin (IL)-31Rβ, IL-31Rα, IL-12Rβ1, IL-12Rβ2, IL-27R, IL-9Rα, IL-11Rα, and the oncostatin-M-specific receptor (OSMR). Furthermore, only six structures are available of shorter fragments of ICDs in complex with signaling molecules, as exemplified by the juxtamembrane domain of the EPOR in complex with the FERM-SH2 domains of JAK2, a 12-residue phospho-peptide from GHR-ICD in complex with SOCS2, an 11-residue phospho-peptide of IL-4R-ICD with the phospho-tyrosine binding domain of the insulin receptor substrate (IRS)-1, and an 8-residue peptide from IL-5Rα-ICD with the PDZ domain of syntenin (Fig. 1b). Drawn to scale on a representative ICD model (Fig. 1b), these binding sites take up only a minor part of the ICD and leave a large area completely unexplored, not to mention the > 25 receptors for which no structures of any part of the ICD are available.
Using a combination of small-angle X-ray scattering (SAXS), NMR and circular dichroism (CD) spectroscopy, we recently performed an extensive characterization of the ICDs of GHR and PRLR, showing that the entire ICDs of the LFs of the human PRLR (PRLR-LF-ICD) and the human GHR (GHR-LF-ICD) are disordered, with only transiently populated helices [35, 82]. Moreover, the PRLR-LF-ICD interacts specifically with hallmark lipids of the inner membrane leaflet through three lipid interaction domains (LIDs), one overlapping with Box1 and Box2. The most membrane proximal of these was also identified in the GHR-LF-ICD, suggesting similar roles in signaling. The only other reports on structural characterization of unbound ICDs of cytokine receptors available in the literature are on small synthetic peptides. One example is a 17-residue peptide from the IL-2Rβ-ICD covering a sorting signal (P285SKFFSQL292) and forming a type I β-turn. Another example is a short peptide from PRLR-ICD containing the Box1 sequence (I243FPPVPGP250). The latter work suggested the presence of cis-trans isomerization of Pro248, the third proline of Box1, supported by a suggested interaction with cyclophilin A (CypA), which co-immunoprecipitated with the first 76 intracellular residues of PRLR. Apart from EPOR, PRLR and GHR, and the short 5–10 residue peptides from the IL-4R and IL-5Rα, the ICDs of the remaining 26 C1CRs of the family have not been studied at the atomic level, and only the PRLR-ICD and GHR-ICD have been studied in complete forms. This leaves a critical knowledge void inhibiting the understanding of how C1CRs signal.
The present paper focuses on the ICDs of the C1CRs and their structural disorder, and asks why disorder has evolved to manage versatility and fidelity in their signaling. We predict disorder in all human C1CRs and list all known isoforms that differ in the ICD. We analyze their primary structures, globally and locally, and examine chain behavior both from sequence and experimentally, identifying shared and unique characteristics across the family. We show that their sequences are brimming with SLiMs, conferring multispecificity to the chains. Instead of considering the ICDs as passive scaffolds for kinases, we put forward a more complex view of active orchestration via organizational and operational features left uncovered within their disorder.
Materials and methods
Proteins - expression and purification
Human PRLR-ICD236–396 was prepared as described in . PRLR-SF1b-ICD was produced as a GST-tagged fusion protein containing a thrombin cleavage site. PRLR-SF1b-ICD was purified essentially following the procedure for PRLR-LF-ICD  with the following modifications: After inoculation of cells from overnight cultures in LB-medium, cells were left growing at 37 °C until OD600 = 0.8. The cells were subsequently centrifuged gently for 15 min at 2500 x g at 4 °C, the supernatant discarded, and the cells gently resuspended by swirling in 500 mL 15N- or 13C- and 15N-labeled M9-minimal media (1.5 g KH2PO4, 3.75 g Na2HPO4•2H2O, 0.5 g NaCl, 1 mM MgSO4, 0.5 ml M2 trace solution, 2 g 13C α-D-Glucose, 0.5 g 15NH4Cl) with 100 μg/ml Amp. The cell suspension was transferred back into the 5 L Erlenmeyer flask and left growing at 37 °C for 45 min after which protein expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 3 h. The cells were harvested by centrifugation (20 min, 5000 x g, 4 °C) and stored at - 20 °C until thawed on ice and resuspended in 40 mL sonication buffer; 1x PBS (1.4 M NaCl, 27 mM KCl, 100 mM Na2HPO4, 18 mM KH2PO4, 10 x stock), 0.1% (v/v) Triton X-100 and one complete EDTA-free protease inhibitor cocktail tablet (Roche Diagnostics GmbH). The cells were sonicated on ice, using an UP400S Ultrasonic Processor, 4 × 30 s with 30 s rest between rounds at 90% amplitude. The cell extracts were centrifuged (25 min, 20,000 x g, 4 °C), the pellets discarded and the supernatants used for purification. The glutathione column (Glutathione Sepharose 4 Fast Flow, GE Healthcare) was prepared by washing with 20 column volumes (CVs) 1xPBS buffer pH 8 and the supernatant from sonication was incubated with the column material for 1 h at room temperature (RT) under gentle agitation. The column was washed with 20 CVs 1xPBS buffer pH 8 and GST-PRLR-SF1b-ICD was eluted using 10 CVs elution buffer (50 mM Tris-HCl pH 8, 10 mM reduced glutathione). 
The sample was dialyzed overnight at 4 °C under stirring against 1 L thrombin cleavage buffer (20 mM Tris-HCl, 150 mM NaCl pH 8.4) using a 6000–8000 MWCO dialysis membrane. One hundred units of thrombin (GE Healthcare) were added to the sample, which was then incubated for 2 h at RT under gentle agitation. After cleavage, the sample was concentrated using centrifugal filters (3000 MWCO, Millipore) and applied to an analytical size exclusion column (Superdex 75 10/300 GL, GE Healthcare) equilibrated with 2 CVs of 50 mM Na2HPO4/NaH2PO4 pH 7.5, 150 mM NaCl and 0.1 mM DTT. The flow rate was 0.5 ml/min and PRLR-SF1b-ICD was eluted over 1.5 CV. GHR-ICD-LF and PRLR-ICD-LF were produced as described in [35, 82].
Small unilamellar vesicles (SUVs) containing POPC/POPS and POPC were prepared as in .
Far-UV CD spectra were recorded on a Jasco-810 spectropolarimeter from 250 nm to 190 nm with a scan speed of 20 nm/min, bandwidth 1 nm, 2 s response time at 25 °C in a 1 mm quartz cuvette. Protein concentration was 19 μM in 10 mM NaH2PO4-NaOH, pH 7.4, 1 mM TCEP. The spectra were averaged over 10 scans with the corresponding spectrum of the buffer subtracted. The resulting spectra were smoothed using a fast Fourier transform, removing the highest frequencies in the spectrum.
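The Fourier smoothing step can be illustrated with a minimal sketch. The function name `fft_smooth` and the `keep_fraction` parameter are hypothetical, and the cut-off actually used for the CD spectra is not stated in the text, so the value below is only a placeholder.

```python
import numpy as np


def fft_smooth(signal, keep_fraction=0.1):
    """Low-pass filter a 1D spectrum by zeroing its highest Fourier frequencies.

    keep_fraction is an assumed, illustrative parameter: the fraction of the
    (non-negative) frequency components that is retained.
    """
    y = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(y)                      # real-input FFT
    n_keep = max(1, int(len(spectrum) * keep_fraction))
    spectrum[n_keep:] = 0.0                        # discard the high frequencies
    return np.fft.irfft(spectrum, n=len(y))        # back to the original length
```

Because the discrete Fourier transform treats the trace as periodic, a spectrum whose two ends differ strongly can show ringing near the edges after this kind of filtering, so the cut-off should be checked against the unsmoothed data.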
For backbone assignment, 13C-15N-labeled PRLR-SF1b-ICD was concentrated to 500 μM in 20 mM Na2HPO4/NaH2PO4 pH 7.3 or 7 M urea pH 7.3 (native and denatured conditions, respectively). The samples were added 10% (v/v) D2O, 5 mM tris (2-carboxyethyl) phosphine (TCEP), and 0.5 mM 2,2-dimethyl-2-silapentane-5-sulfonic acid (DSS) for referencing in a total volume of 350 μl. The pH was adjusted to 7.3 if needed and the samples were transferred to 5 mm Shigemi tubes. All backbone spectra were recorded at 5 °C on Varian INOVA 750- or 800-MHz (1H) spectrometers and backbone assignment accomplished from analyses of 1H-15N-HSQC , HNCACB , CBCA (CO) NH , and HNCO  spectra. Free induction decays were transformed and visualized using NMRPipe  and analyzed using CcpNmr Analysis . Assignments were done manually. Transient secondary structure elements were identified by secondary chemical shifts (SCSs). These were calculated by subtracting the Cα and C′ chemical shifts for each residue in 7 M urea from those obtained in 20 mM Na2HPO4/NaH2PO4. The amount of transient α-helices was assessed as described . Two series of 1H-15N-HSQC  spectra were recorded on 15N-PRLR-SF1b-ICD at 5 °C to analyze the relaxation times. The spectra for PRLR-SF1b-ICD were recorded with delay times of 10–1000 ms (T1) and 10–250 ms (T2) with two triplicates in each series. The relaxation decays were fitted to single exponentials and the relaxation times were determined using the CcpNmr Analysis software . For CypA interaction studies, samples containing 100 μM 15N-labelled PRLR-SF1b-ICD or PRLR-LF-ICD with and without 85 μM human CypA (Sigma Aldrich) were prepared in 20 mM Na2HPO4/NaH2PO4 pH 7.3, 10% (v/v) D2O, 1 mM TCEP, 0.5 mM DSS. To accommodate slight differences in conditions of the different protein batches, all batches were thoroughly dialyzed against the same buffer before mixing the samples. 
1H-15N-HSQC spectra were recorded on each sample at 5 °C and chemical shift changes were analyzed as combined amide chemical shift changes by $\delta\Delta_{\mathrm{NH}} = \sqrt{(\delta\Delta_{\mathrm{H}})^2 + (0.154\,\delta\Delta_{\mathrm{N}})^2}$.
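As a small worked illustration of the combined amide chemical shift change, the sketch below takes the 1H and 15N shift differences (in ppm) of one residue and applies the 0.154 weighting of the nitrogen dimension given above. The function and argument names are hypothetical.

```python
import math


def combined_amide_shift(delta_h_ppm, delta_n_ppm, n_weight=0.154):
    """Combined 1H/15N amide chemical shift change for one residue (ppm).

    delta_h_ppm and delta_n_ppm are the shift differences between the two
    samples (e.g. with and without CypA); n_weight scales the 15N dimension.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


# Example: a residue shifting 0.03 ppm in 1H and 0.20 ppm in 15N
example_shift = combined_amide_shift(0.03, 0.20)
```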
Small angle X-ray scattering
SAXS data were collected at the EMBL beamline P12 at PETRA III (DESY synchrotron, Hamburg, Germany). Data on PRLR-LF-ICD and GHR-LF-ICD (20 mM Na2HPO4/NaH2PO4 (pH 7.3), 10 times molar excess DTT) were collected following standard procedures. A series of concentrations between 1 and 6 mg/mL was measured for each protein with 0, 75, 150 and 300 mM NaCl added. The SAXS curves for PRLR-ICD-LF and GHR-ICD-LF were analyzed using the form factor for a Gaussian random coil, together with a scaling factor for correcting the protein concentration and a constant to model the background. The fitting was done with the minimize function of the optimization library in the SciPy package for Python 3 using the L-BFGS-B algorithm. Uncertainties in the fitting parameters were estimated from the diagonal of the inverse Hessian. Pair distance distribution functions were calculated using BayesApp [96, 97] on the GenApp server (https://genapp.rocks (visited on April 15, 2020)). In the case of PRLR-ICD-LF, a few data points were omitted at low Q based on the Guinier analysis, but data from 0.0075 Å⁻¹ and up were used in all cases. The Guinier analysis was also used to truncate the data for GHR-ICD so that only data above 0.0084 Å⁻¹ were used. Due to the lower S/N, data above 0.3 Å⁻¹ were also omitted from the analysis and the Lagrange multiplier was fixed to 1014, in order to ensure a stable solution with respect to data range and input parameters.
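The random-coil fit described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis script itself: the Gaussian random coil is modeled with the Debye function, the objective is half the chi-square, and the parameter names, start values, bounds and synthetic test curve are all hypothetical. Note also that L-BFGS-B only returns a limited-memory approximation of the inverse Hessian, so the derived uncertainties are approximate.

```python
import numpy as np
from scipy.optimize import minimize


def debye(q, rg):
    """Form factor of a Gaussian random coil (Debye function); requires q > 0."""
    x = (q * rg) ** 2
    return 2.0 * (np.expm1(-x) + x) / x ** 2


def model(params, q):
    """Scaled Debye form factor plus a constant background."""
    rg, scale, background = params
    return scale * debye(q, rg) + background


def half_chi2(params, q, intensity, sigma):
    """Half the chi-square, so the inverse Hessian approximates the covariance."""
    residuals = (intensity - model(params, q)) / sigma
    return 0.5 * np.sum(residuals ** 2)


def fit_random_coil(q, intensity, sigma, start=(40.0, 1.0, 0.0)):
    """Fit Rg, scale and background with L-BFGS-B (start and bounds are assumed)."""
    bounds = [(1.0, 500.0), (1e-6, None), (None, None)]
    result = minimize(half_chi2, start, args=(q, intensity, sigma),
                      method="L-BFGS-B", bounds=bounds)
    covariance = result.hess_inv.todense()   # approximate inverse Hessian
    errors = np.sqrt(np.diag(covariance))
    return result.x, errors


# Synthetic curve standing in for measured data (values are made up)
rng = np.random.default_rng(0)
q = np.linspace(0.0075, 0.3, 200)            # in inverse Angstrom
sigma = np.full_like(q, 0.002)
data = model((60.0, 1.0, 0.01), q) + rng.normal(0.0, sigma)
fitted, errors = fit_random_coil(q, data, sigma)
```

Weighting the residuals by the experimental errors keeps the three parameters on a comparable footing in the objective; with real data, `sigma` would be the measured uncertainties of the intensities.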
To calculate a reference Rg for PRLR-LF-ICD and GHR-LF-ICD, we used the power law in Eq. 1,
$R_g = R_0 N^{\nu}$ (1)
where R0 is a constant related to the persistence length of the polymer, N is the number of amino acids and ν is related to the nature of the polymer.
For R0 and ν, we used the experimental parameters from Kohn and co-workers , determined based on chemically unfolded proteins, i.e. R0 = 1.927 Å and ν = 0.598, where the latter value is very close to the theoretical value of 0.588 of self-avoiding polymer chains.
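The reference value follows directly from the power law Rg = R0·N^ν with the parameters quoted above. The short sketch below evaluates it for an arbitrary chain length; the function name and the example length of 100 residues are illustrative and do not correspond to either ICD.

```python
def reference_rg(n_residues, r0=1.927, nu=0.598):
    """Reference radius of gyration (in Angstrom) of a chemically unfolded chain.

    Uses Rg = R0 * N**nu with the parameters of Kohn and co-workers.
    """
    return r0 * n_residues ** nu


# Hypothetical chain length, chosen only to illustrate the scaling
rg_100 = reference_rg(100)
```

Comparing an experimental Rg with this reference indicates whether a chain is more compact or more expanded than expected for a chemically unfolded protein of the same length.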
Disorder predictions, sequence alignments and bioinformatics
The original list of C1CRs was manually curated from the list of receptors in [33, 34], after which isoforms were extracted from Uniprot . Disorder predictions for each of the sequences were conducted as described previously , and the sequences were analyzed by IDDomainSpotter , CIDER  and Weblogo3.0  using standard setting and TMHMM was used to predict borders between domains . The amino acid propensities for each of the sequences were calculated relative to an ordered reference statistic, as (C(aa) - C_ref(aa)) / C_ref(aa), where C(aa) is the frequency for the individual residue and C_ref(aa) is the frequency of the residue in the reference set. We calculated our ordered reference frequencies, C_ref(aa), from entries in the MobiDB database , using the curated DB set and selecting all protein regions labeled as ‘S’. The same set was used to calculate the frequencies for the Disorder category in Fig. 2b, this time selecting regions labeled as ‘D’. To validate the sensitivity to this choice, we also calculated the frequencies from the larger Derived MobiDB set, with qualitatively similar results. Finally, we verified the impact of homology reduction on these frequencies by reweighting each protein by the number of other entries in its UniRef50 class, but again observed similar results, and homology reduction was therefore omitted for the remaining calculations. The error bars in Fig. 2b indicate the 90% confidence interval of the estimated frequencies, calculated using a per-protein bootstrapping procedure with 1000 iterations .
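The relative propensity and the per-protein bootstrap can be sketched as below. This is a minimal illustration under assumptions: the function names are hypothetical, the reference frequencies are passed in as a plain dictionary rather than derived from MobiDB, and the confidence interval is taken as simple percentiles of the resampled frequencies.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequencies(sequences):
    """Pooled amino acid frequencies over a list of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(c for c in seq if c in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_propensity(sequences, ref_freq):
    """(C(aa) - C_ref(aa)) / C_ref(aa) for every residue type in the reference."""
    freq = frequencies(sequences)
    return {aa: (freq[aa] - ref_freq[aa]) / ref_freq[aa]
            for aa in AMINO_ACIDS if ref_freq.get(aa, 0) > 0}


def bootstrap_ci(sequences, aa, n_iter=1000, level=0.90, seed=0):
    """Percentile confidence interval of one frequency, resampling whole proteins."""
    rng = random.Random(seed)
    values = sorted(
        frequencies(rng.choices(sequences, k=len(sequences)))[aa]
        for _ in range(n_iter)
    )
    lower = values[int((1 - level) / 2 * n_iter)]
    upper = values[int((1 + level) / 2 * n_iter) - 1]
    return lower, upper
```

Resampling whole proteins rather than individual residues keeps the correlation between residues of the same chain, which is the point of a per-protein bootstrap.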
SLiMs and phosphorylation sites were predicted in the longest isoforms of group 1 receptors (PRLR-FL, GHR-FL, EPOR-FL and TPOR1). SLiMs were predicted using the ELM resource and filtered based on taxonomic context (Homo sapiens) and cellular compartment (cytosol) with a probability cut-off of 100. Modification sites were excluded from analysis. The iGPS 1.0 algorithm was used for prediction of in vivo phosphorylation sites. Only instances above a medium threshold were included. Confirmed SLiMs and phosphorylation sites were manually curated.
Results and discussion
Disorder in C1CRs
The ICDs of the C1CRs have been very sparsely studied structurally, likely because of their high expected abundance of disorder. Moreover, many of the receptors exist in several different alternatively spliced versions, some of which differ in the ICDs. To provide an overview of the ICD isoforms, we conducted a survey of known isoforms of the entire family (Fig. 1c and Table 1). We followed the grouping of the receptors made based on structure and evolution [33, 34], but excluded receptors absent in humans and/or for which no ICD could be annotated. Furthermore, IL-2Rα and IL-15Rα were excluded as they lack the structural hallmarks of the family and likely belong to a separate family. This left us with a total of 29 different receptors, distributed with four members in group 1 (single-chain homodimers), ten members in group 2 (the gp130 family, not counting receptors binding ancestral cytokines), two members in group 3 (soluble α-chains, leaving out four receptors without ICDs (IL-27Rβ, ciliary neurotrophic factor receptor subunit α (CNTFRα), cytokine receptor-like factor 1 (CRLF1), and IL-12Rβ)), six members in group 4 (long-tailed receptor chains) and seven members in group 5 (short-tailed receptor chains), Table 1. The 29 receptors provided a total of 54 ICD isoforms distributed across all groups. Approximately 40% of the receptors (12 receptors; leukemia inhibitory factor receptor (LIFR), OSMR, IL-6Rβ (gp130), IL-27R, IL-6Rα, IL-2Rβ, IL-21R, IL-4Rα, βc (IL-3Rβ), CRLF2, γc (IL-2Rγ), IL-13Rα1, IL-13Rα2, and IL-3Rα) only had one ICD isoform, and three of these were the common receptors IL-6Rβ (gp130), βc (IL-3Rβ) and γc (IL-2Rγ). For the remaining 16 receptors, up to five different isoforms could be identified and nine of the receptors with ICDs > 200 residues had a short isoform < 50 residues.
A total of 16 isoforms had unique sequences, typically > 10 residues with an average length of 25 ± 16 residues, and the longest unique sequence of 67 residues belonged to the LEPR isoform C. The average length of the longest ICD isoforms was 188 residues (group 1), 210 residues (group 2), 82 residues (group 3), 335 residues (group 4) and 65 residues (group 5) (Table 1). IL-4Rα had the longest ICD of 575 residues (Table 1). Thus, the C1CR-ICDs are generally long and some have isoforms with unique sequences of considerable length.
The characteristic compositional bias of IDPs makes it possible to predict the degree of disorder in proteins computationally. Almost 10 years ago, computational predictions of disorder were made for five of the C1CRs, but disorder predictors have since improved in quality and reproducibility, and no study has examined the entire family in unison. We therefore predicted the disorder profiles of the longest ICD isoforms of all 29 C1CRs as well as the propensity of regions to undergo folding-upon-binding using the ANCHOR scores (Fig. 2a and SI Fig. S1). From these predictions, we observed that the ICDs of the entire family have high scores (> 0.5) for disorder along their complete sequence and none were predicted to harbor folded domains. Furthermore, almost all receptors had lower disorder scores in the juxtamembrane 20–50 residues, a region overlapping with the JAK1/2/3;TYK2 binding sites. Along the chains, regions of lower disorder propensity were observed, which at the same time were paralleled by high ANCHOR scores. Such a signature suggests that the region is prone to folding-upon-binding and thus constitutes a potential binding site. Indeed, the dip with the lowest disorder score in PRLR-LF-ICD occurred around residue 610, which corresponds to the region of tyrosine phosphorylation by JAK2 (Y580/Y614) and the docking site for STAT5 (YLDP). Comparing the profiles across the group 1 C1CRs revealed a similar pattern of disorder along the first 150–200 residues, although the extent of each of the regions with higher/lower disorder varies (Fig. 2a). Group-specific profiles with some similarity in the first half of the ICDs were also seen for groups 2, 3, and 4, but not for group 5 (SI Fig. S1), an observation likely reflecting their shorter ICDs. Finally, we compared the disorder profiles for the five different isoforms of the PRLR-ICD (SI Fig. S2).
Despite the change in sequence, all the ICD isoforms were predicted to be disordered and with almost identical disorder profiles. This is consistent with the general observation that sequence may change in a family of proteins while the disorder-order profile persists [110, 113].
In summary, the predicted disorder profiles support that the ICDs of all the C1CRs are disordered (Fig. 2b) and highlight common disorder profiles with a distribution of binding sites prone to folding-upon-binding.
The ICDs of the C1CRs have compositional biases distinguishing them from other IDPs
To address whether the C1CR-ICDs have physicochemical properties that distinguish them from other IDPs, we compared the amino acid content of the entire family (Fig. 2b) as well as the individual groups (SI Fig. S3) to those of folded proteins and other IDPs (for details see methods). The analysis revealed that the C1CR-ICDs indeed have global sequence compositions that stand out from other IDPs in three ways: First, some amino acids are depleted in the C1CR-ICDs, namely Met, Arg, Ala and Lys, which are less frequent than in IDPs in general and in folded proteins. Second, Cys, Trp, Leu and Val are significantly more frequent in the C1CR-ICDs than in other IDPs, and are as frequent as in folded proteins (except Val, which is less frequent than in folded proteins). Third, Pro is highly enriched in the C1CR-ICDs, and is even more frequent than in both folded proteins and IDPs in general. These differences are remarkable, but the role of these global compositional biases in C1CR functionality remains to be understood. The depletion in positively charged amino acids could be related to prevention of detrimental interactions with the negatively charged inner membrane leaflet to which the C1CR-ICDs are tethered through their TMDs, or with other negatively charged molecules. In IDPs, the enriched residues Cys, Trp, Leu, Val and Pro are often found in SLiMs. Indeed, the saturation of SLiMs along the C1CR-ICD chains, as highlighted in Fig. 3 (see below), suggests enrichment in binding sites and may reflect large interactomes. Pro residues are known to preserve disorder in regions of IDPs with residual structural propensities, and hence could counter-balance the effects of the enrichment in hydrophobic residues. Furthermore, the chemistry of Pro causes rigidification of the backbone and consequently conformational expansion, as well as the formation of polyPro type II (PPII) structures by Pro-rich motifs.
Finally, several SLiMs and modification sites are Pro-based, including binding sites for JAKs and SH3s, and MAPK modification sites, which may increase the relative content of Pro in C1CR-ICDs.
Thus, even though C1CR-ICDs are classified as disordered from the disorder predictions, they have a remarkable compositional bias that distinguishes them from other IDPs, possibly due to SLiM enrichment.
Although sequence identity is often low among related IDPs, the sequence characteristics important for function are typically conserved, whether these are specific SLiMs, global conformational characteristics or specific functional domains. Thus, regions of specific residue biases can be taken to represent domains of different chemical and structural properties, which may contribute differently to the function of the C1CR-ICDs. To identify putative functional domains of specific physicochemical properties across the C1CR-ICDs, we submitted the sequences to IDDomainSpotter. IDDomainSpotter reveals distinct compositional biases in regions of long IDP sequences by calculating the fractions of specified residues in a sliding window of 15 residues, meaning that for each residue k, the fraction of the specified residues between k − 7 and k + 7 is given. Here, we have analyzed e.g. the net charge by setting Lys and Arg as positive contributors (+RK) and Glu and Asp as negative contributors (−DE). Hence, a given residue within the sliding window counts as +1 if the position is a Lys or Arg, −1 if the position is a Glu or Asp, and 0 for any other residue.
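The windowed net charge score described above can be sketched in a few lines of Python. This is a minimal illustration of the scoring rule as stated in the text, not the IDDomainSpotter implementation: the function name `net_charge_profile` is hypothetical, and the treatment of chain ends (the window is simply truncated where fewer than seven neighbours exist) is an assumption.

```python
def net_charge_profile(seq, window=15):
    """Net charge fraction around each residue k, using residues k-7 .. k+7.

    Lys/Arg count +1, Asp/Glu count -1, all other residues count 0.
    Assumption: near the chain ends the window is truncated and the sum
    is divided by the number of residues actually present in the window.
    """
    half = window // 2
    score = {"K": 1, "R": 1, "D": -1, "E": -1}
    per_residue = [score.get(aa, 0) for aa in seq.upper()]
    profile = []
    for k in range(len(per_residue)):
        lo = max(0, k - half)
        hi = min(len(per_residue), k + half + 1)
        window_scores = per_residue[lo:hi]
        profile.append(sum(window_scores) / len(window_scores))
    return profile


if __name__ == "__main__":
    # Made-up sequence: a positive stretch followed by a negative stretch.
    toy = "KRKKRLLKRAGSPDEEDSPEDELLSDEE"
    for k, value in enumerate(net_charge_profile(toy), start=1):
        print(k, toy[k - 1], round(value, 2))
```

Plotting such a profile against residue number gives the kind of trace from which positively and negatively charged regions can be read off; other residue sets (for example Pro, Cys, or Ile/Leu/Val) can be scored the same way by changing the `score` dictionary.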
The IDDomainSpotter analysis of the C1CR-ICDs revealed shared profiles for certain residues across the receptors (Fig. 2c), suggesting functional importance. First, they all shared a stretch of 10–20 residues immediately following the TMD with a positive net charge, as typically observed for type I membrane proteins [30, 116]. This is followed by a region of ~ 50–60 residues (only ~ 20 residues for the shorter TPOR) with a negative net charge (Fig. 2c, green). We denominate these regions as the positive domain (PD) and the negative domain (ND), respectively. For the ICDs of PRLR and GHR, the PD has been shown to specifically interact with negatively charged lipids of the inner leaflet of the membrane. However, the role of the ND is not understood. The negative charges may be relevant for membrane repulsion, or for compaction with the PD when not membrane bound. Alternatively, the ND could provide negatively charged flanking regions for specific SLiMs, such as for Box2 binding to SH2 domains or Pro-rich motifs binding to SH3 domains, as recently supported by experiments. Second, from ~ 100 residues after the TMD and onwards, the net charge was close to equally balanced along the chains. Another shared property is the almost equal distribution of the unusually abundant Cys and Pro throughout the chains (Fig. 2c, orange and purple). This could suggest that their abundance is related to global conformational properties, rather than e.g. interaction sites or PPIIs. The ICDs of group 1 further shared a pattern of alternating depletion and enrichment (over 20–50 residues) of the hydrophobic branched amino acids Ile, Leu and Val throughout their chain (Fig. 2c, red). Such hydrophobic side chains are usually less abundant in IDPs because of the energetic penalty of solvation, and hence in IDPs they are often primarily located in e.g. SLiMs, or related to maintaining extended β-structures of relevance to binding.
Finally, clustering of Phe, Tyr and Gly (+FYG) was analyzed, as IDRs enriched in these residues may be involved in liquid-liquid phase separation [118, 119], but no major clustering of these residues was present throughout the chains.
The patterns observed for the group 1 receptors are overall shared across the C1CR-ICDs, with the exception of some noteworthy variations in charge composition. Generally, all the C1CR-ICDs harbor a PD (sometimes in a shorter version), followed by an ND of various length, with subsequent close to net neutral charge along the chains. However, the short ICDs of group 3 only harbor a PD, and lack an ND. Furthermore, the ICDs of IL-31Rβ (OSMR) (group 2) and IL-4Rα (group 4) lack regions of substantial net charge throughout their ICDs, including the PD; a trait that may be related to their association with IL-31Rα and IL-13Rα1, respectively.
Short linear motifs allow expansion of the interactome
In the disorder profiles and the IDDomainSpotter analysis we observed distinct patterns, which may suggest the presence of multiple binding sites (Fig. 2). For all receptors, the first region of low disorder propensity corresponds to the juxtamembrane region containing the most conserved and well-known motifs, Box1 and Box2, involved in JAK/TYK binding. As a Pro-rich motif, Box1 represents one of the most abundant SLiMs in the eukaryotic proteome. The polyPro scaffold inherently provides a conformational bias towards PPII formation, which creates a structural predisposition that may drive an interaction via a reduced entropic penalty of complex formation. Except for group 5, which lacks canonical Box1 and Box2 motifs, all receptors harbor the conserved PXP motif in Box1 (SI Fig. S4), known to interact with the JAK-FERM domain [44, 124]. However, in most receptors, Box1 is further extended to PXXPXP (consensus for group 1: φφPXφPXP, where φ is any hydrophobic residue), which thereby accommodates both the minimal SH3 binding motif (PXXP) and the FERM binding motif (PXP) in one combined SLiM (Table 2). This enables competitive binding to Src and JAK kinases [48, 126]. Although Box2 is markedly less conserved than Box1, sequence alignments reveal glutamates to be the most abundant residues in Box2 (Fig. 3a). This is in accordance with their essential role as phospho-mimics in binding to the atypical JAK-SH2 domain [44,45,46]. Studies suggest JAK kinase binding to be driven by Box1 association with the FERM domain, which thereby increases the local concentration of Box2 near the SH2 domain, further suggesting Box1 as the primary anchoring point.
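That the extended Box1 accommodates both the PXXP and the PXP motif can be illustrated with a small pattern search. The sketch below is only an illustration: the regular expressions are literal translations of the consensus strings given in the text (X = any residue), the names `BOX1_MOTIFS` and `find_motifs` are hypothetical, and the example peptide is a made-up sequence that fits the group 1 consensus φφPXφPXP rather than a sequence taken from a receptor.

```python
import re

# Literal translations of the consensus strings in the text (X -> ".").
BOX1_MOTIFS = {
    "FERM-binding PXP": r"P.P",
    "minimal SH3-binding PXXP": r"P..P",
    "extended Box1 PXXPXP": r"P..P.P",
}


def find_motifs(seq, motifs=BOX1_MOTIFS):
    """Return (name, start, matched_text) for every hit, including overlaps.

    A zero-width lookahead is used so that overlapping and nested matches
    are all reported; start positions are 0-based.
    """
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(f"(?=({pattern}))", seq.upper()):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Made-up peptide matching the consensus (I-L-P-P-V-P-G-P), plus flanks.
    for hit in find_motifs("AILPPVPGPK"):
        print(hit)
```

In this toy peptide the single PXXPXP stretch contains one PXXP and two PXP matches, which is the sense in which one combined SLiM can present binding sites for both SH3 and FERM domains.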
In the membrane distal region, SLiMs constituting docking sites for various signaling proteins have been mapped experimentally. These SLiMs are predominantly activated by phosphorylation to recruit SH2-containing proteins, such as STAT and SOCS proteins [127, 128]. Each group of receptors is known to preferentially recruit a specific STAT for activation and each group therefore contains a specific subset of phospho-tyrosine motifs. Hence, group 1 harbors the STAT5 consensus motif, pYXXL, whereas group 2 harbors the STAT3 consensus motif, pYXXQ. In addition to SLiMs related to distinct down-stream signaling, many of the experimentally known SLiMs are related to endocytosis, trafficking and degradation. Some of these are frequent and experimentally well-described motifs, such as the dileucine motifs (i.e. [D/E]XXX[L/I][L/I] and [D/E]XXLL) seen in LIFR and IL-2Rβ, and the tyrosine-based motifs (i.e. YXXϕ), promoting clathrin-dependent endocytosis and internalization, first identified in the TPOR (YRRL) and later in the G-CSFR [134, 135]. Additionally, phosphorylation dependent degrons (i.e. DSGXXS), promoting ubiquitin-dependent proteasomal degradation, are well characterized in both GHR and PRLR [136,137,138]. The motifs are summarized in Table 2.
For the longest ICD-isoforms of group 1, we subsequently predicted SLiMs and phosphorylation sites using the Eukaryotic Linear Motifs server (ELM) and the iGPS, respectively (see methods), and mapped these to the sequences, marking those already experimentally confirmed (Fig. 3b). We made three important observations. First, the predicted SLiMs as well as their flanking regions are rich in amino acids that promote extended structures, such as Pro, Val and Leu, in accordance with the structures adopted in the bound states [79, 140, 141], and the compositional analysis made above. Second, it was evident that clusters of overlapping SLiMs are frequent and distributed across the ICD sequences, interleaved with stretches depleted in SLiMs (Fig. 3b). Clusters of overlapping SLiMs may be scaffolding hot spots where multiple binding events can take place in a controlled manner, largely determined by binding competition, i.e. affinity, concentration and PTMs. However, IDPs may also accommodate simultaneous binding of several partners and thereby orchestrate signaling by bringing relevant proteins into close proximity. As evident from Table 2, similarities between the tyrosine-based motifs are pronounced. Consequently, STAT and SOCS binding motifs may overlap with each other, but also with phosphatase binding sites, such as e.g. for SHP2 [143,144,145], as well as with tyrosine-based internalization motifs. Thus, regulation of signaling fate by discrimination and availability via compressed motifs appears widespread in C1CRs and is critically linked to properties of disorder. Until recently, C1CR-regions with overlapping motifs have exclusively been characterized in the membrane distal regions. However, accumulating evidence suggests that the membrane proximal regions also contain SLiM-clusters.
In GHR, the membrane proximal region of ~ 60 residues contains a LID with an unknown function, a ubiquitin dependent degron to which the E3-ligase βTrCP docks and promotes GHR downregulation, as well as JAK2 and Src kinase (LYN) binding sites (Fig. 3c). JAK2 and LYN are the primary kinases in GHR signaling, controlling the activation of JAK2/STAT5 and MAPK pathways, respectively. Both are known to be constitutively associated with the receptor ICD [48, 146] and their relative activation of pathways can be perturbed by mutations in the ECD affecting TMD alignment. However, the molecular details of how the change in TMD alignment is associated with pathway selectivity are still unknown, but selectivity may be controlled by competitive binding of JAK2 and LYN and be further affected by membrane interaction. Similarly, GHR downregulation by βTrCP may be driven by competition in binding. Thus, this region represents one of the essential composite SLiM-clusters in GHR with hitherto unexplored implications for the regulation of GHR signaling.
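The notion of a cluster of overlapping SLiMs, interleaved with stretches depleted in SLiMs, can be made operational by merging motif hits whose sequence ranges overlap. The following is a minimal sketch under stated assumptions: the function name `merge_into_clusters` is hypothetical, hits are given as (name, start, end) with an exclusive end, and the positions in the example are invented for illustration and do not correspond to mapped sites in any receptor.

```python
def merge_into_clusters(hits):
    """Group motif hits into clusters of mutually overlapping ranges.

    hits: iterable of (name, start, end), 0-based start, exclusive end.
    Two hits join the same cluster if the second starts before the
    running end of the cluster (i.e. they share at least one residue).
    """
    clusters = []
    for name, start, end in sorted(hits, key=lambda hit: (hit[1], hit[2])):
        if clusters and start < clusters[-1]["end"]:
            clusters[-1]["end"] = max(clusters[-1]["end"], end)
            clusters[-1]["motifs"].append(name)
        else:
            clusters.append({"start": start, "end": end, "motifs": [name]})
    return clusters


if __name__ == "__main__":
    # Invented positions, for illustration only.
    example_hits = [
        ("kinase-binding A", 10, 18),
        ("kinase-binding B", 12, 16),
        ("degron", 40, 46),
        ("SH2-docking", 44, 48),
        ("internalization", 80, 84),
    ]
    for cluster in merge_into_clusters(example_hits):
        print(cluster)
```

Clusters containing more than one motif mark regions where binding partners would have to compete for, or share, the same stretch of chain; the gaps between clusters correspond to the SLiM-depleted spacers.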
Typically, multiple binding events in IDPs are regulated by phosphorylations and can be characterized as binary on/off switches. However, accumulating evidence has revealed that phosphorylations can generate much more complex responses, and multisite phosphorylation can additionally generate sensitive threshold responses as well as graded responses. The third important observation we made was that in the group 1 C1CRs, several phosphorylation sites were predicted, but only a small subset of these were well-characterized experimentally. Hence, much remains to be understood in terms of regulation, and the many modification sites open the possibility that the C1CR-ICDs have rheostat-like regulatory potential. In this way, successive phosphorylations may additively increase (or decrease) binding affinity, enabling graded responses, or they may modulate the conformational ensemble, with impact on signaling output. Importantly, multisite phosphorylations, which are functionally relevant for IDPs, remain to be addressed in the C1CRs.
In summary, interactions by C1CR-ICDs are primarily mediated by SLiMs, creating docking and modification sites for several accessory signaling proteins. Furthermore, clusters of overlapping SLiMs dominate the C1CR-ICDs, which, together with the structural plasticity provided by the disorder, impart a unique condensed and versatile signaling scaffold. This scaffold enables the establishment of large interactomes, whose content is controlled by the available pool and concentrations of interaction partners as well as by PTMs. The spatio-temporal orchestration of signaling therefore relies on the availability of binding partners, affinities and kinetics, which together determine signaling fate.
Network rewiring by isoforms
In order to investigate different C1CR-ICD isoforms at the molecular level, we took two approaches. First, we compared a long and a short isoform experimentally, using the PRLR as model, and second, we predicted and compared the presence of SLiMs for those C1CRs that have longer isoforms differing in the ICD sequence (i.e. having isoforms of unique sequence, not just truncations).
First, we expressed and biophysically characterized PRLR-SF1b (residues 259–288) and compared it to PRLR-LF-ICD. PRLR-SF1b-ICD is much shorter (32 versus 386 residues) and differs only in the last three C-terminal residues, where K286, G287 and K288 of PRLR-LF-ICD are substituted by V286, T287, and P288 (P288 is the new C-terminus). Apart from the loss of multiple interaction sites by being shorter, including the loss of Box2, the chemical change from net positively charged to uncharged with more hydrophobic residues may influence the structural preferences as well as the interactome, especially membrane binding. From detailed NMR analyses, the PRLR-SF1b-ICD maintained structural disorder and dynamics (SI Fig. 5a-c), but transient helix 1, observed in the PRLR-LF-ICD, was eliminated in the PRLR-SF1b-ICD as seen by smaller secondary chemical shifts (SCSs) of Cα nuclei (Fig. 4a and SI Fig. 5c-d). This demonstrates that two isoforms differing in only three residues may have different structural propensities (Fig. 4a). Functionally, however, and despite structural and chemical changes, membrane interaction using small-unilamellar vesicles of POPC:POPS (3:1) as membrane mimetics, previously observed for PRLR-LF-ICD, was preserved in PRLR-SF1b-ICD, as seen from the loss of NMR signal intensities and from chemical shift perturbations (Fig. 4b,c). Box1 was previously suggested to interact with cyclophilin A (CypA), and Box1 proline cis/trans isomerization was claimed to be important for this interaction. However, from NMR chemical shift analyses (SI Table S1), Pro cis/trans isomerization appeared not to be dominant in the free state of either isoform, and accordingly neither of them interacted with CypA (SI Fig. 5f-i). Thus, despite sequence and structure differences between the two isoforms, functionality was maintained in the short isoform.
In this case, shortening of the ICD by removing 90% of it, including numerous SLiMs and phosphorylation sites, as well as docking sites for STAT and SOCS, only resulted in a major reduction in the interactome.
Shorter isoforms as described for PRLR exist for 9 of the C1CRs. However, many ICD isoforms have longer regions of unique sequence, which differ from the canonical isoform. Thus, to explore if these longer ICD isoforms may have gained new interaction sites, we predicted which common SLiMs were gained or lost, disregarding potential phosphorylation sites and receptor-unique SLiMs (Table S2). The unique sequences were found to carry distinct SLiMs. In the case of group 1, a 14-3-3 binding SLiM has previously been identified in PRLR isoform 1; a SLiM originally discovered to be active in cytokine receptors in the IL-9R. However, predictions suggested the presence of a different 14-3-3 motif, just six residues C-terminal to the experimentally described site (Fig. 4d, SI Table S2). Compared to the PRLR-LF-ICD, the short forms did not possess the STAT-docking sites or the 14-3-3 binding SLiM. Instead, different 14-3-3 binding SLiMs, each with different sequence properties, were predicted in the unique sequences (SF1a and SF1c) (Fig. 4d). For TPOR, isoform 2 had two 14-3-3 SLiMs, which were absent in isoform 1, while for the GHR and EPOR, which do not have any known 14-3-3 binding SLiMs, the isoforms with unique sequences (isoform 2 and 3 of GHR and isoform T of EPOR) were much shorter, and without any predictable common or different SLiMs. Thus, for PRLR, the preservation of the 14-3-3 SLiM despite changes in sequence suggests a key regulatory function, one of which may be to attenuate receptor signaling as suggested. Relevantly, the LFs of EPOR, GHR and PRLR all had a phosphorylation dependent degron, interacting with the SkpSCF-betaTrCP1 complex or the Skp1_Cullin-Fbox, leading to ubiquitylation and degradation, as shown experimentally for PRLR and GHR, where it negatively regulates receptor stability [136, 137].
These degrons were not identified in any of the shorter isoforms, which have also been seen to be stabilized on the membrane; a possible result of the lack of associated proliferative signaling and hence lack of need for immediate down regulation.
For other receptors, e.g. the IL-31Rα (GLMR) and LEPR isoforms, we found that unique sequences introduced PDZ binding SLiMs at the new C-termini (Fig. 4e). Furthermore, when the PDZ motif is present, each IL-31Rα (GLMR) receptor isoform had a unique PDZ-binding SLiM (except for isoforms 3 and 5, which are identical in their ICDs), allocating the isoforms to interact with different classes (Class 1, 2 and 3) of PDZ domains (Fig. 4e) [152, 153]. The same was true for the LEPR, where each isoform has a unique PDZ binding SLiM (Fig. 4f). In fact, the introduction of a PDZ SLiM in the C-terminus in one isoform was observed for several receptors including G-CSFR, IL-7Rα, IL-9Rα, and GM-CSFRα (Table S2). Why these isoforms need PDZ binding motifs is not clear, but several scaffolding proteins with specialized subcellular localization and tissue specificity exist, known to contain multiple PDZ domains by which they orchestrate supramolecular complexes. Binding of the IL-5Rα-ICD to a PDZ domain from syntenin (Fig. 1b) supports the involvement of further scaffolding proteins in the formation of larger signaling complexes. PDZ containing proteins may be of relevance to the C1CRs and could include proteins from the NHERF and PSD-95 families, which also scaffold kinases such as Fyn. Alternatively, E3-ligases belonging to the MARCH family coordinate binding via PDZ domains and are relevant for ubiquitylation of proteins in the intracellular membranes. However, besides the complex between the IL-5Rα-ICD and the PDZ domain from syntenin, complexes of C1CRs with PDZ domains remain to be experimentally explored. Finally, for all receptors with isoforms, the longest isoform, except for GM-CSFRα, carries the interaction motifs for STATs, either STAT5 or STAT3 or both, and additionally carries a binding motif for TNF receptor-associated factor (TRAF)-2 or TRAF-6, none of which are found in the other, shorter isoforms.
In a few cases, the STAT and/or TRAF motifs are maintained in the second longest isoform, and sometimes a shift between STAT5 and STAT3 or between TRAF-2 and TRAF-6 occurs.
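How a change in only the C-terminal residues can reallocate an isoform between PDZ classes can be illustrated with C-terminally anchored patterns. The sketch below is an illustration only: the three patterns are the simplified consensus forms commonly quoted for class 1, 2 and 3 PDZ ligands (an assumption here, and not necessarily the motif definitions used to generate Table S2), the function name `pdz_class` is hypothetical, and the example C-termini are made up.

```python
import re

# Simplified consensus for the last three residues (assumed for illustration):
# class 1: S/T-X-hydrophobic, class 2: hydrophobic-X-hydrophobic,
# class 3: D/E-X-hydrophobic. "$" anchors the pattern at the C-terminus.
PDZ_CLASSES = {
    "class 1": re.compile(r"[ST].[ACVILF]$"),
    "class 2": re.compile(r"[VLIFY].[ACVILF]$"),
    "class 3": re.compile(r"[DE].[ACVILF]$"),
}


def pdz_class(icd_sequence):
    """Return the PDZ ligand class suggested by the C-terminus, or None."""
    seq = icd_sequence.upper()
    for label, pattern in PDZ_CLASSES.items():
        if pattern.search(seq):
            return label
    return None


if __name__ == "__main__":
    # Made-up isoform C-termini differing only in their last residues.
    for tail in ["GSPAETSV", "GSPAEYYV", "GSPAGEDV", "GSPAAAAK"]:
        print(tail, pdz_class(tail))
```

Because the patterns are anchored at the free C-terminus, isoforms that share most of their sequence but end differently can fall into different classes, or lose the motif altogether.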
Thus, for the C1CRs, the disorder predictions and experimental characterization of selected representatives have suggested that the isoforms maintain structural disorder, and their presence suggests several mechanisms by which disorder orchestrates signaling. The first is the complete removal of a large part of the ICD, eliminating SLiMs important for STAT activation, TRAF interaction and downregulation by degradation via degron activation. In this way the shorter isoforms act as negative regulators, or decoy receptors, of signaling, as seen for the short forms of the PRLR and GHR [62, 151]. However, these isoforms still maintain binding capacity, as seen for membrane binding of PRLR-SF1b above. The second mechanism by which isoforms orchestrate signaling is via rewiring of the interactome to access completely new networks, exemplified by the addition and removal of binding sites for e.g. 14-3-3 proteins and PDZ domains. This allows for different signaling profiles dependent on expression profiles of the C1CR-ICD isoforms. However, more studies into network rewiring of the C1CRs are warranted, and the analysis made here merely provides a starting point.
The conformational ensembles of C1CR-ICDs
IDPs are functional without taking on a single, well-defined tertiary structure. Yet, they cannot adequately be described as simple statistical coil chains equally populating all possible conformations allowed by their backbone torsion angles. Instead, IDPs display varying degrees of compaction and elongation, and contain transient, short- and long-range structural organizations. Hence, the disorder of the C1CR-ICDs not only confers flexibility and high accessibility of binding sites, but certain chain dimensions and spatial organizations may also influence the organization of the signaling complexes and orchestration of protein interactions, and in the end, signaling outcome. Currently, the conformation and dimensions of IDPs cannot be quantitatively predicted from sequence [157, 158]. Nonetheless, the balance between chain-chain and chain-solvent interactions that determines the conformational preference is related to specific sequence features that influence the conformational ensembles in predictable ways [101, 157, 159, 160]. One set of these relates to global compositional sequence features (i.e. parameters that are independent of the sequence order), among which the fraction of charged residues and the net charge per residue are particularly important [159, 161]. In addition, features relating to sequence patterning, especially the patterning of oppositely charged residues and expansion promoting residues, influence compaction. However, the current difficulties in consistently predicting the conformational ensemble of all IDPs reflect that some of these behaviors are encoded in sequence features yet to be unraveled.
IDPs have been classified into five compositional groups in a diagram of states based on their fraction of positively charged residues (f+) and fraction of negatively charged residues (f−). These two global parameters are combined into two measures underlying a diagram of states: the fraction of charged residues (FCR = f+ + f−) and the net charge per residue (NCPR = f+ − f−). An explanation of the relation between these parameters and the properties of the chain is given in the supplemental data. Of 879 IDRs longer than 15 residues found in DisProt, CIDER classified 40% as belonging to R1, 35% to R2, 22% to R3, and 3% to either R4 or R5. For each C1CR-ICD, the sequence of isoform 1 was submitted to CIDER, except for LEPR and G-CSFR, for which isoform B and 3, respectively, were selected as these were the longest isoforms (see Table 1). For GM-CSFRα, both isoform 1 and 2 were analyzed because they differed in more than 50% of their C-terminal sequences (see Table 1). The C1CR-ICDs generally fell close to the boundary between R1 and R2, with most belonging to R1 (61%) (Fig. 2a), suggesting a preference for compact, but still dynamic, heterogeneous conformational ensembles. Nonetheless, in particular for sequences belonging to R2, their overall charge neutrality means that their conformational preference cannot be predicted from global composition alone [157, 159]. It should also be noted that the boundary between R1 and R2 has been determined ad hoc, and has been suggested to be positioned at lower FCR for longer sequences [157, 159]. Furthermore, for ICDs > 100 residues or with a high proline fraction (> 0.15), no qualitative prediction of the conformation can be made for sequences of R1, as these tend to have more extended conformations than their scores predict.
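The two global charge parameters follow directly from the definitions above and can be computed as in the sketch below. The calculation of f+, f−, FCR and NCPR is as defined in the text (Lys and Arg counted as positive, Asp and Glu as negative). The region assignment, in contrast, is an assumption for illustration: the boundaries used (FCR < 0.25 for R1, 0.25–0.35 for R2, > 0.35 for R3, and |NCPR| > 0.35 for the strong polyelectrolyte regions R4/R5) are the values commonly quoted for the diagram of states and should be checked against the CIDER documentation; the function names are hypothetical.

```python
def charge_parameters(seq):
    """Return f+, f-, FCR and NCPR for a protein sequence.

    Lys/Arg are counted as positive, Asp/Glu as negative (His is ignored).
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    f_plus = sum(seq.count(aa) for aa in "KR") / len(seq)
    f_minus = sum(seq.count(aa) for aa in "DE") / len(seq)
    return {
        "f_plus": f_plus,
        "f_minus": f_minus,
        "FCR": f_plus + f_minus,
        "NCPR": f_plus - f_minus,
    }


def diagram_region(fcr, ncpr):
    """Assign a diagram-of-states region from FCR and NCPR.

    Assumed boundaries (to be verified against CIDER): strong
    polyelectrolytes (R4 or R5) if |NCPR| > 0.35, otherwise R1/R2/R3
    by increasing FCR.
    """
    if abs(ncpr) > 0.35:
        return "R4/R5"
    if fcr < 0.25:
        return "R1"
    if fcr <= 0.35:
        return "R2"
    return "R3"


if __name__ == "__main__":
    params = charge_parameters("KKDEAAAAAA")  # made-up sequence
    print(params, diagram_region(params["FCR"], params["NCPR"]))
```

With these assumed boundaries, a sequence with an FCR of about 0.2 and an NCPR close to zero is assigned to R1, whereas a sequence with an FCR above 0.35 and a near-zero NCPR is assigned to R3.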
Since almost all the C1CR-ICDs are long IDPs of R1 and R2, the conformational preferences cannot be predicted from global composition alone but may also be influenced by e.g. sequence patterning. In particular, the patterning, or mixing, of oppositely charged residues is important, as is that of expansion driving and aromatic residues. The parameter κ reports on how well positively and negatively charged residues are segregated across the sequence and is normalized between 0 and 1, with κ close to zero representing sequences with evenly distributed charges, while sequences with κ close to 1 have highly segregated charges. It has been shown that as κ approaches 1, the conformational ensemble becomes more compact. However, since κ is calculated by normalizing to the most segregated sequence within the given composition, a specific κ value will not have the same meaning for two sequences with different FCR and |NCPR| values. Furthermore, for long IDRs, such as most of the C1CR-ICDs, κ is calculated only within windows of 5 and 6 residues, ignoring long-range effects. κ is most informative for sequences with an FCR above 0.25 and NCPR between − 0.1 and + 0.1, for which a κ below 0.12 is considered low and a κ above 0.25 is considered high. Especially for polyampholytic sequences with an FCR beyond 0.4, charge patterning is predicted to have a major impact on the conformation. There is one such example, namely the GM-CSFRα, which has an FCR of 0.41, an NCPR of − 0.07 and a κ of 0.25, suggesting chain compaction.
The position of the vast majority (94%) of the C1CR-ICDs in R1 and R2 is a consequence of their low net charges. Their FCR values are in the intermediate range of 0.1 < FCR < 0.3, while at the same time, their NCPR is close to 0, demonstrating that they are near-symmetrical polyampholytes. For the C1CR groups with long ICDs (1, 2 and 4), the group average FCRs (0.21; 0.23; 0.19) and NCPRs (− 0.06; − 0.06; − 0.05) are remarkably similar, suggesting that charge properties are a conserved trait. The similarity of these parameters also allows us to compare their κ values more directly, with group averages of 0.20 for group 1, 0.18 for group 2 and 0.22 for group 4. This is consistent with the IDDomainSpotter analysis presented earlier (Fig. 2c). Here we found that almost all of the C1CR-ICDs harbored a PD immediately following their TMD, succeeded by an ND, and with net charge neutral regions for the remainder of the chain. Together, this suggests that the influence of the global charges and the charge patterning on the conformational ensembles are consistent throughout group 1, 2 and 4, except for IL-31Rβ (OSMR) (group 2) and IL-4Rα (group 4). As mentioned, the shorter ICDs of group 3 and 5 result in somewhat different global charge properties.
The Ω parameter describes the patterning of both the charged residues and proline. Like κ, Ω is normalized between 0 and 1, with Ω close to zero representing sequences with evenly distributed charges and prolines, while sequences with Ω close to 1 have highly segregated charges and prolines. It has been shown that when Ω approaches 1, the preference for expanded conformations increases. A high fraction of Pro (> ~ 15%) may cause more expanded conformations as Pro prefers to be solvated and promotes stiffness. Five of the C1CR-ICDs had a high fraction of Pro (15–19%): IL-27R, IL-6Rα, IL-11Rα, βc, IL-13Rα2 (Fig. 5a, top). The amino acid fraction and IDDomainSpotter analysis (Fig. 2) revealed that the Pro residues are unusually abundant in the C1CR-ICDs and close to equally distributed. From the CIDER analysis we found that Ω, like κ, is similar for many of the C1CR-ICDs, but is lower for the shorter sequences (Fig. 5a). This could simply be a consequence of the Pro-rich Box1 sequences, leading to relatively higher proline scores in the shorter sequences.
To summarize, the theoretical analysis of the C1CR-ICD sequence parameters known to influence compaction suggests that many of them may be similarly biased towards a specific degree of extension or compaction of their conformational ensembles, but that this degree cannot be predicted from sequence. Hence, to determine this bias, we experimentally investigated the degree of compaction and its responsiveness to salt by SAXS using two long, representative C1CR-ICDs, namely PRLR-LF-ICD (Fig. 5b,c) and GHR-LF-ICD (SI Fig. S6). The SAXS profiles of the PRLR-LF-ICD were consistent with those expected for fully disordered proteins, and were fitted to an Rg of 57.3 ± 1.4 Å in 20 mM phosphate buffer. The predicted Rg of the PRLR-LF-ICD for a fully random coil state is, according to Kohn et al., 65 Å, suggesting that the PRLR-LF-ICD populates a slightly compacted ensemble. The pair distance distribution function (P(r)), which is a histogram of distance distributions within the protein, peaks at ~ 45 Å and has a Dmax of ~ 200 Å. Increasing the concentration of salt to 300 mM did not significantly affect the fitted Rg or the P(r) distribution (Fig. 5b,c), suggesting that the global degree of compaction of PRLR-ICD is not sensitive to salt, as otherwise often observed for more charged IDPs, perhaps related to the high content of Pro and branched amino acids. The same trends were observed from SAXS on the GHR-LF-ICD, which has a similarly slightly compacted ensemble that was insensitive to salt (SI Fig. S6). Hence, the ICDs across the C1CR family, having similar global charge properties and patterning (Fig. 5a), may populate similarly compacted ensembles, as also indicated by the Ω scores (0.15–0.35), although this remains to be experimentally more broadly verified.
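The comparison between the measured Rg and the random coil expectation can be illustrated with a power-law scaling of Rg with chain length, Rg = R0·N^ν. The sketch below assumes the form and constants commonly quoted from the denatured-state scaling of Kohn et al. (R0 ≈ 1.927 Å, ν ≈ 0.598); these values are stated here as an assumption and should be checked against the cited reference. The function names are hypothetical, and the number of residues in the measured construct is not given in this section, so the chain lengths in the example are illustrative only.

```python
def random_coil_rg(n_residues, r0=1.927, nu=0.598):
    """Expected radius of gyration (in Å) of a random coil of n_residues.

    Power-law scaling Rg = r0 * N**nu; r0 and nu are assumed constants
    (commonly quoted for chemically denatured proteins).
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return r0 * n_residues ** nu


def compaction_ratio(rg_measured, rg_random_coil):
    """Ratio of measured to random coil Rg (values below 1: more compact)."""
    return rg_measured / rg_random_coil


if __name__ == "__main__":
    # Illustrative chain lengths; the exact construct length is an assumption.
    for n in (350, 386, 400):
        print(n, round(random_coil_rg(n), 1))
    # Values reported in the text: 57.3 Å measured versus 65 Å predicted.
    print(round(compaction_ratio(57.3, 65.0), 2))
```

Under these assumptions, chains of roughly 350 to 400 residues give random coil values in the mid-to-upper 60s of Å, the same range as the 65 Å quoted above, and the reported values correspond to a measured Rg of about 0.88 of the random coil expectation.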
Versatile and controlled orchestration of signaling by unique structural disorder in C1CRs
It is remarkable that the entire family of C1CRs, differentiating into > 50 isoforms, are all predicted to be disordered in their entire ICD sequence. Nonetheless, the disordered ICDs are critically understudied, leaving us with a naive and too simplistic schematic view of the ICDs as passive strings of varying lengths with kinases constitutively attached. In the present paper we have highlighted the properties linked to disorder responsible for controlling the diverse signaling by C1CRs (Table 2) and asked: Why has disorder been selected for governing intracellular C1CR signaling? Their completely disordered nature stands in contrast to the majority of other types of single-pass transmembrane receptors such as the receptor tyrosine kinases, where intracellular signaling is mainly governed by intrinsic kinase activity. We have here shown that the long disordered ICDs of C1CRs are brimming with clusters of multifunctional SLiMs throughout their length, suggesting that one explanation is the signaling versatility and scaffolding capacity of this type of ICD. Furthermore, we have outlined that overlapping SLiMs are prevalent in the C1CR-ICDs, hinting that the disordered ICDs further allow for complex regulation of diverse signaling through competition and regulation of interactions with a plethora of different binding partners through multispecificity. Thereby, activation becomes dependent on the coupled equilibria and kinetics of two (or more) binding events. Indeed, the ability of a distinct region in the disordered ICD to bind to many different proteins is facilitated by structural adaptation and folding-upon-binding [164, 165]. Additionally, the C1CR-ICDs are hot spots for multiple phosphorylation events of which only a few are well-characterized as binary on-off switches. These events directly, or indirectly, affect affinities and additionally expand the number of states accessible by the chain at any time.
We have further suggested that an additional layer to this regulation is added by the existence of different C1CR-ICD isoforms, in which entire groups of SLiMs can be eliminated and new ones added, a feature that is much easier for IDPs to successfully obtain during evolution compared to folded proteins. By controlled expression of the isoforms, a complete rewiring of the interaction network can be achieved. Hence, the fully disordered nature of the C1CR-ICDs allows for a fascinatingly versatile and complex interaction hub.
Can such signal complexity be facilitated through a simple string with kinases attached? Our sequence analysis and experimental studies have revealed biases in the C1CR-ICDs that differentiate them from being simple statistical coils. They have conserved distinct compositional biases that differentiate them from other IDPs; biases that are distributed throughout their chains, including the presence of disordered domains of specific physiochemical properties. This suggests that these compositional biases are representing shared functionalities yet to be characterized. Our experimental SAXS data on the long ICDs of the archetypical receptors GHR and PRLR revealed that they are slightly more compacted than expected for a fully random coil (~ 57 Å versus 65 Å for PRLR-LF-ICD). This indeed suggests an inherent conformational bias based on the conservation of certain sequence properties maintained across the family. Importantly, however, it should be kept in mind that in the cell, the C1CR-ICDs are most likely never completely void of interactions at any point. Previous characterizations of PRLR-LF-ICD and GHR-LF-ICD have revealed the presence of LIDs, further suggesting distinct organizational features at the membrane interface where also many kinases are tethered. Additionally, we found that the same sets of SLiMs are placed differently in the C1CR-ICDs with variable distances, which may provide an additional tuning of the signaling outcome, both via the length of their disordered spacers as well as the properties of these . Thus, SLiM organization within the chain may imprint different affinities in different complexes despite their exploitation of identical SLiMs, providing an additional layer of the spatio-temporal orchestration of signaling. 
In fact, it is possible that disordered cytoplasmic domains can generally be classified by their collection of SLiMs, providing specific SLiM catalogues of disordered membrane proteins. Such a decomposition will, however, require a much broader analysis across many different protein families.
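The idea of a SLiM catalogue, and of identical SLiMs separated by different distances, can be sketched in a few lines. The example below is a minimal illustration, not the analysis used in this study: the two motif patterns are deliberately simplified, hypothetical regular expressions, and the two sequences are invented toy chains. Curated motif definitions, for example from the ELM resource, are considerably more specific.

```python
import re

# Hypothetical, simplified motif patterns for illustration only.
MOTIFS = {
    "PxxP": r"P..P",
    "YxxL": r"Y..L",
}


def slim_catalogue(sequence, motifs=MOTIFS):
    """Return {motif name: [0-based start positions]}, including overlapping hits."""
    catalogue = {}
    for name, pattern in motifs.items():
        # A zero-width lookahead makes finditer report overlapping matches.
        lookahead = re.compile(f"(?=({pattern}))")
        catalogue[name] = [m.start() for m in lookahead.finditer(sequence)]
    return catalogue


def start_distances(catalogue):
    """Distances (in residues) between starts of consecutive motif hits."""
    starts = sorted(pos for hits in catalogue.values() for pos in hits)
    return [b - a for a, b in zip(starts, starts[1:])]


# Invented toy sequences: same motifs, different spacer length.
toy_a = "MKPPSPPSGGSGGYEDLGS"
toy_b = "MKPPSPPSGGSGGSGGSGGYEDLGS"

cat_a = slim_catalogue(toy_a)
cat_b = slim_catalogue(toy_b)

print(cat_a, start_distances(cat_a))
print(cat_b, start_distances(cat_b))
```

Both toy chains yield the same catalogue in terms of motif types and counts, including two overlapping proline-rich hits, but differ in the distance between the proline-rich and tyrosine-based motifs. A family-wide classification would compare such catalogues and spacings across many receptors.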
In conclusion, we suggest that the C1CR-ICDs are far from simple strings with constitutively bound kinases. Rather, they carry both organizational and operational features that remain hidden in their disorder but are of key importance for understanding the orchestration of signaling. How these features operate in the higher-order oligomers of the C1CRs, which bring ICDs from several chains into close proximity, adds a further dimension to future studies. For example, the mere volume occupied by several chains may allow them to generate higher-order ensembles with specific properties. In such disordered reaction chambers, lower-affinity interactions may be boosted and even shared between chains, adding features, mechanisms and complexities to regulation that are yet to be discovered. Taken together, it is evident that the understanding of the fascinating ability of these long, completely disordered chains to orchestrate complex signaling pathways is still in its infancy.
Availability of data and materials
All data generated or analyzed during this study are included in this published article and its additional information files.
Abbreviations
C1CR: Class 1 cytokine receptor
FCR: Fraction charged residues
IDP: Intrinsically disordered protein
LID: Lipid interaction domain
NCPR: Net charge per residue
NMR: Nuclear magnetic resonance
SAXS: Small angle X-ray scattering
SCS: Secondary chemical shift
SLiM: Short linear motif
SUV: Small unilamellar vesicles
References
Wright PE, Dyson HJ. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J Mol Biol. 1999;293:321–31.
Sigler PB. Acid blobs and negative noodles. Nature. 1988;333:210–2.
Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 2004;337:635–45.
Kumar R, Thompson E. Role of phosphorylation in the modulation of the glucocorticoid receptor’s intrinsically disordered domain. Biomolecules. 2019;9:95.
Mitrea DM, Yoon M-K, Ou L, Kriwacki RW. Disorder-function relationships for the cell cycle regulatory proteins p21 and p27. Biol Chem. 2012;393:259–74.
Follis AV, Galea CA, Kriwacki RW. Intrinsic protein flexibility in regulation of cell proliferation: advantages for signaling and opportunities for novel therapeutics. Adv Exp Med Biol. 2012.
Staby L, O’Shea C, Willemoës M, Theisen F, Kragelund BB, Skriver K. Eukaryotic transcription factors: paradigms of protein intrinsic disorder. Biochem J. 2017;474:2509–32.
Uversky VN, Dunker AK. Understanding protein non-folding. Biochim Biophys Acta - Proteins Proteomics. 2010:1231–64 Elsevier.
Mohan A, Oldfield CJ, Radivojac P, Vacic V, Cortese MS, Dunker AK, et al. Analysis of molecular recognition features (MoRFs). J Mol Biol. 2006;362:1043–59.
Lee S-H, Kim D-H, Han JJ, Cha E-J, Lim J-E, Cho Y-J, et al. Understanding Pre-Structured Motifs (PreSMos) in Intrinsically Unfolded Proteins. Curr Protein Pept Sci. 2012;13:34–54 Bentham Science Publishers Ltd.
Chhabra Y, Wong HY, Nikolajsen LF, Steinocher H, Papadopulos A, Tunny KA, et al. A growth hormone receptor SNP promotes lung cancer by impairment of SOCS2-mediated degradation. Oncogene. 2018;37:489–501.
Krystkowiak I, Davey NE. SLiMSearch: a framework for proteome-wide discovery and annotation of functional modules in intrinsically disordered regions. Nucleic Acids Res. 2017;45.
Prestel A, Wichmann N, Martins JM, Marabini R, Kassem N, Broendum SS, et al. The PCNA interaction motifs revisited: thinking outside the PIP-box. Cell Mol Life Sci. 2019;76:4923–43 Springer.
Bugge K, Brakti I, Fernandes CB, Dreier JE, Lundsgaard JE, Olsen JG, et al. Interactions by disorder – a matter of context. Front Mol Biosci. 2020;7.
Iakoucheva LM, Radivojac P, Brown CJ, O’Connor TR, Sikes JG, Obradovic Z, et al. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32:1037–49.
Radivojac P, Vacic V, Haynes C, Cocklin RR, Mohan A, Heyen JW, et al. Identification, analysis, and prediction of protein ubiquitination sites. Proteins Struct Funct Bioinforma. 2010;78:365–80.
Lee CW, Ferreon JC, Ferreon ACM, Arai M, Wright PE. Graded enhancement of p53 binding to CREB-binding protein (CBP) by multisite phosphorylation. Proc Natl Acad Sci. 2010;107:19290–5 National Academy of Sciences.
Martin EW, Holehouse AS, Peran I, Farag M, Incicco JJ, Bremer A, et al. Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science. 2020;367:694–9.
Kim TH, Tsang B, Vernon RM, Sonenberg N, Kay LE, Forman-Kay JD. Phospho-dependent phase separation of FMRP and CAPRIN1 recapitulates regulation of translation and deadenylation. Science. 2019;365:825–9.
Arai M, Sugase K, Dyson HJ, Wright PE. Conformational propensities of intrinsically disordered proteins influence the mechanism of binding and folding. Proc Natl Acad Sci U S A. 2015;112:9614–9 National Academy of Sciences.
Yang J, Gao M, Xiong J, Su Z, Huang Y. Features of molecular recognition of intrinsically disordered proteins via coupled folding and binding. Protein Sci. 2019;28:1952–65 Blackwell Publishing Ltd.
Shammas SL, Crabtree MD, Dahal L, Wicky BIM, Clarke J. Insights into coupled folding and binding mechanisms from kinetic studies. J Biol Chem. 2016:6689–95.
Olsen JG, Teilum K, Kragelund BB. Behaviour of intrinsically disordered proteins in protein – protein complexes with an emphasis on fuzziness. Cell Mol Life Sci. 2017;74:3175–83 Springer International Publishing.
Borgia A, Borgia MB, Bugge K, Kissling VM, Heidarsson PO, Fernandes CB, et al. Extreme disorder in an ultrahigh-affinity protein complex. Nature. 2018;555:61–6 Nature Publishing Group.
Kjaergaard M, Kragelund BB. Functions of intrinsic disorder in transmembrane proteins. Cell Mol Life Sci. 2017;74:3205–24 Birkhauser Verlag AG.
Hendus-Altenburger R, Kragelund BB, Pedersen SF. Structural Dynamics and Regulation of the Mammalian SLC9A Family of Na+/H+ Exchangers. Curr Top Membr. 2014;73:69–148 Academic Press Inc.
Warne T, Moukhametzianov R, Baker JG, Nehmé R, Edwards PC, Leslie AGW, et al. The structural basis for agonist and partial agonist action on a β1-adrenergic receptor. Nature. 2011;469:241–5 Nature Publishing Group.
Minezaki Y, Homma K, Nishikawa K. Intrinsically disordered regions of human plasma membrane proteins preferentially occur in the cytoplasmic segment. J Mol Biol. 2007;368:902–13 Academic Press.
Kassem N, Kassem MM, Pedersen SF, Pedersen PA, Kragelund BB. Yeast recombinant production of intact human membrane proteins with long intrinsically disordered intracellular regions for structural studies. Biochim Biophys Acta - Biomembr. 2020;1862:183272.
De Biasio A, Guarnaccia C, Popovic M, Uversky VN, Pintar A, Pongor S. Prevalence of intrinsic disorder in the intracellular region of human single-pass type I proteins: the case of the notch ligand Delta-4. J Proteome Res. 2008;7:2496–506.
Zeev-Ben-Mordehai T, Rydberg EH, Solomon A, Toker L, Auld VJ, Silman I, et al. The intracellular domain of the Drosophila cholinesterase-like neural adhesion protein, gliotactin, is natively unfolded. Proteins. 2003;53:758–67.
Brooks AJ, Dehkhoda F, Kragelund BB. Cytokine Receptors. Princ Endocrinol Horm Action. 2016.
Boulay J-L, O’Shea JJ, Paul WE. Molecular phylogeny within type I cytokines and their cognate receptors. Immunity. 2003;19:159–63 Cell Press.
Liongue C, Ward AC. Evolution of class I cytokine receptors. BMC Evol Biol. 2007;7:120.
Bugge K, Papaleo E, Haxholm GW, Hopper JTS, Robinson CV, Olsen JG, et al. A combined computational and structural model of the full-length human prolactin receptor. Nat Commun. 2016;7:11578 Nature Publishing Group.
Shields DC, Harmon DL, Nunez F, Whitehead AS. The evolution of haematopoietic cytokine/receptor complexes. Cytokine. 1995;7:679–88.
Brooks AJ, Waters MJ. The growth hormone receptor: mechanism of activation and clinical implications. Nat Rev Endocrinol. 2010;6:515–25 Nature Publishing Group.
Baker SJ, Rane SG, Reddy EP. Hematopoietic cytokine receptor signaling. Oncogene. 2007;26:6724–37 Nature Publishing Group.
Lebrun JJ, Ali S, Ullrich A, Kelly PA. Proline-rich sequence-mediated Jak2 association to the prolactin receptor is required but not sufficient for signal transduction. J Biol Chem. 1995:10664–70.
Rui H, Kirken RA, Farrar WL. Activation of receptor-associated tyrosine kinase JAK2 by prolactin. J Biol Chem. 1994;269:5364–8.
Pezet A, Buteau H, Kelly PA, Edery M. The last proline of box1 is essential for association with JAK2 and functional activation of the prolactin receptor. Mol Cell Endocrinol. 1997;129:199–208.
Bagley CJ, Woodcock JM, Stomski FC, Lopez AF. The structural and functional basis of cytokine receptor activation: lessons from the common β subunit of the granulocyte-macrophage colony-stimulating factor, Interleukin-3 (IL-3), and IL-5 receptors. Blood. 1997;89:1471–82 American Society of Hematology.
Murakami M, Narazaki M, Hibi M, Yawata H, Yasukawa K, Hamaguchi M, et al. Critical cytoplasmic region of the interleukin 6 signal transducer gp130 is conserved in the cytokine receptor family. Proc Natl Acad Sci U S A. 1991;88:11349–53 Natl Academy of Sciences.
Ferrao RD, Wallweber HJA, Lupardus PJ. Receptor-mediated dimerization of JAK2 FERM domains is required for JAK2 activation. Elife. 2018;7:1–21.
Ferrao R, Wallweber HJA, Ho H, Tam C, Franke Y, Quinn J, et al. The structural basis for class II cytokine receptor recognition by JAK1. Structure. 2016;24:897–905 Elsevier Ltd.
Wallweber HJA, Tam C, Franke Y, Starovasnik MA, Lupardus PJ. Structural basis of recognition of interferon-α receptor by tyrosine kinase 2. Nat Struct Mol Biol. 2014;21:443–8.
Tilbrook P, Palmer G, Bittorf T, McCarthy D, Wright M, Sarna M, et al. Maturation of Erythroid cells and erythroleukemia development are affected by the kinase activity of Lyn. Cancer Res. 2001;61:2453–8.
Rowlinson SW, Yoshizato H, Barclay JL, Brooks AJ, Behncken SN, Kerr LM, et al. An agonist-induced conformational change in the growth hormone receptor determines the choice of signalling pathway. Nat Cell Biol. 2008;10:740–7.
Lannutti BJ, Drachman JG. Lyn tyrosine kinase regulates thrombopoietin-induced proliferation of hematopoietic cell lines and primary megakaryocytic progenitors. Blood. 2004;103:3736–43 American Society of Hematology.
Clevenger CV, Medaglia MV. The protein tyrosine kinase P59fyn is associated with prolactin (PRL) receptor and is activated by PRL stimulation of T-lymphocytes. Mol Endocrinol. 1994;8:674–81.
Brooks AJ, Dai W, O’Mara ML, Abankwa D, Chhabra Y, Pelekanos RA, et al. Mechanism of Activation of Protein Kinase JAK2 by the Growth Hormone Receptor. Science. 2014;344:1249783.
Brown RJ, Adams JJ, Pelekanos RA, Wan Y, McKinstry WJ, Palethorpe K, et al. Model for growth hormone receptor activation based on subunit rotation within a receptor dimer. Nat Struct Mol Biol. 2005;12:814–21.
Buljan M, Chalancon G, Eustermann S, Wagner GP, Fuxreiter M, Bateman A, et al. Tissue-specific splicing of disordered segments that embed binding motifs rewires protein interaction networks. Mol Cell. 2012;46:871–83.
Romero PR, Zaidi S, Fang YY, Uversky VN, Radivojac P, Oldfield CJ, et al. Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms. Proc Natl Acad Sci. 2006;103:8390–5.
Light S, Sagit R, Ekman D, Elofsson A. Long indels are disordered: a study of disorder and indels in homologous eukaryotic proteins. Biochim Biophys Acta - Proteins Proteomics. 2013;1834:890–7.
Light S, Sagit R, Sachenkova O, Ekman D, Elofsson A. Protein expansion is primarily due to indels in intrinsically disordered regions. Mol Biol Evol. 2013;30:2645–53.
Goffin V, Bogorad RL, Touraine P. Identification of Gain-of-Function Variants of the Human Prolactin Receptor. Methods Enzymol. 2010:329–55 1st ed. Elsevier Inc.
Bogorad RL, Courtillot C, Mestayer C, Bernichtein S, Harutyunyan L, Jomain J-B, et al. Identification of a gain-of-function mutation of the prolactin receptor in women with benign breast tumors. Proc Natl Acad Sci. 2008;105:14533–8.
Kline JB, Rycyzyn MA, Clevenger CV. Characterization of a novel and functional human prolactin receptor isoform (deltaS1PRLr) containing only one extracellular fibronectin-like domain. Mol Endocrinol. 2002;16:2310–22.
Tan D, Huang K, Ueda E. S2 deletion variants of human PRL receptors demonstrate that extracellular domain conformation can alter conformation of the intracellular signaling domain. Biochemistry. 2008;47:479–89.
Kline JB, Clevenger CV. Identification and characterization of the prolactin-binding protein in human serum and milk. J Biol Chem. 2001;276:24760–6.
Trott JF, Hovey RC, Koduri S, Vonderhaar BK. Alternative splicing to exon 11 of human prolactin receptor gene results in multiple isoforms including a secreted prolactin-binding protein. J Mol Endocrinol. 2003;30:31–47.
Hu Z-Z, Meng J, Dufau ML. Isolation and characterization of two novel forms of the human prolactin receptor generated by alternative splicing of a newly identified exon 11. J Biol Chem. 2001;276:41086–94.
Kline JB, Roehrs H, Clevenger CV. Functional characterization of the intermediate isoform of the human prolactin receptor. J Biol Chem. 1999;274:35461–8.
Pezet A, Ferrag F, Kelly PA, Edery M. Tyrosine docking sites of the rat prolactin receptor required for association and activation of Stat5. J Biol Chem. 1997;272:25043–50.
Bachelot A, Bouilly J, Liu Y, Rebourcet D, Leux C, Kuttenn F, et al. Sequence variation analysis of the prolactin receptor C-terminal region in women with premature ovarian failure. Fertil Steril. 2010;94:2772–5.
Lesueur L, Edery M, Ali S, Paly J, Kelly PA, Djiane J. Comparison of long and short forms of the prolactin receptor on prolactin-induced milk protein gene transcription. Proc Natl Acad Sci U S A. 1991;88:824–8.
Das R, Vonderhaar BK. Transduction of prolactin’s (PRL) growth signal through both long and short forms of the PRL receptor. Mol Endocrinol. 1995;9:1750–9.
Huang K, Ueda E, Chen Y, Walker AM. Paradigm-shifters: phosphorylated prolactin and short prolactin receptors. J Mammary Gland Biol Neoplasia. 2008;13:69–79.
Yamauchi T, Kaburagi Y, Ueki K, Tsuji Y, Stark GR, Kerr IM, et al. Growth hormone and prolactin stimulate tyrosine phosphorylation of insulin receptor substrate-1, −2, and −3, their association with p85 phosphatidylinositol 3-kinase (PI3-kinase), and concomitantly PI3-kinase activation via JAK2 kinase. J Biol Chem. 1998;273:15719–26.
Amaral MEC, Cunha DA, Anhê GF, Ueno M, Carneiro EM, Velloso LA, et al. Participation of prolactin receptors and phosphatidylinositol 3-kinase and MAP kinase pathways in the increase in pancreatic islet mass and sensitivity to glucose during pregnancy. J Endocrinol. 2004;183:469–76.
Meng J, Tsai-Morris C-H, Dufau ML. Human prolactin receptor variants in breast cancer: low ratio of short forms to the long-form human prolactin receptor associated with mammary carcinoma. Cancer Res. 2004;64:5677–82.
Huang K, Walker AM. Long term increased expression of the short form 1b prolactin receptor in PC-3 human prostate cancer cells decreases cell growth and migration, and causes multiple changes in gene expression consistent with reduced invasive capacity. Prostate. 2010;70:37–47.
Bugge K, Lindorff-Larsen K, Kragelund BB. Understanding single-pass transmembrane receptor signaling from a structural viewpoint - what are we missing? FEBS J. 2016;283:4424–51.
Li Q, Wong YL, Huang Q, Kang C. Structural insight into the Transmembrane domain and the Juxtamembrane region of the erythropoietin receptor in micelles. Biophys J. 2014;107:2325–36 Biophysical Society.
Li Q, Wong YL, Lee MY, Li Y, Kang C. Solution structure of the transmembrane domain of the mouse erythropoietin receptor in detergent micelles. Sci Rep. 2015;5:1–10 Nature Publishing Group.
Bocharov EV, Lesovoy DM, Bocharova OV, Urban AS, Pavlov KV, Volynsky PE, et al. Structural basis of the signal transduction via transmembrane domain of the human growth hormone receptor. Biochim Biophys Acta - Gen Subj. 2018;1862:1410–20 Elsevier B.V.
Schmidt T, Ye F, Situ AJ, An W, Ginsberg MH, Ulmer TS. A conserved ectodomain-transmembrane domain linker motif tunes the allosteric regulation of cell surface receptors. J Biol Chem. 2016;291:17536–46 American Society for Biochemistry and Molecular Biology Inc.
Kung WW, Ramachandran S, Makukhin N, Bruno E, Ciulli A. Structural insights into substrate recognition by the SOCS2 E3 ubiquitin ligase. Nat Commun. 2019;10:1–14 Springer US.
Zhou MM, Huang B, Olejniczak ET, Meadows RP, Shuker SB, Miyazaki M, et al. Structural basis for IL-4 receptor phosphopeptide recognition by the IRS-1 PTB domain. Nat Struct Biol. 1996;3:388–93 Nature Publishing Group.
Kang BS, Cooper DR, Jelen F, Devedjiev Y, Derewenda U, Dauter Z, et al. PDZ tandem of human syntenin: crystal structure and functional properties. Structure. 2003;11:459–68 Cell Press.
Haxholm GW, Nikolajsen LF, Olsen JG, Fredsted J, Larsen FH, Goffin V, et al. Intrinsically disordered cytoplasmic domains of two cytokine receptors mediate conserved interactions with membranes. Biochem J. 2015;468:495–506.
Subtil A, Delepierre M, Dautry-Varsat A. An α-helical signal in the cytosolic domain of the interleukin 2 receptor β chain mediates sorting towards degradation after endocytosis. J Cell Biol. 1997;136:583–95 The Rockefeller University Press.
O’Neal KD, Chari MV, Mcdonald CH, Cook RG, Yu-Lee LY, Morrisett JD, et al. Multiple cis-trans conformers of the prolactin receptor proline-rich motif (PRM) peptide detected by reverse-phase HPLC, CD and NMR spectroscopy. Biochem J. 1996;315(Pt 3):833–44.
Syed F, Rycyzyn MA, Westgate L, Clevenger CV. A novel and functional interaction between cyclophilin A and prolactin receptor. Endocrine. 2003;20:83–90.
Kay L, Keifer P, Saarinen T. Pure absorption gradient enhanced heteronuclear single quantum correlation spectroscopy with improved sensitivity. J Am Chem Soc. 1992;114:10663–5.
Wittekind M, Mueller L. HNCACB, a high-sensitivity 3D NMR experiment to correlate amide-proton and nitrogen resonances with the alpha- and Beta-carbon resonances in proteins. J Magn Reson Ser B. 1993;101:201–5.
Grzesiek S, Bax A. Correlating backbone amide and side chain resonances in larger proteins by multiple relayed triple resonance NMR. J Am Chem Soc. 1992;114:6291–3.
Kay LE, Ikura M, Tschudin R, Bax A. Three-dimensional triple-resonance NMR spectroscopy of isotopically enriched proteins. J Magn Reson. 1990;89:496–514.
Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, Bax A. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR. 1995;6:277–93.
Vranken WF, Boucher W, Stevens TJ, Fogh RH, Pajon A, Llinas M, et al. The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins Struct Funct Bioinforma. 2005;59:687–96.
Zhang H, Neal S, Wishart DS. RefDB: a database of uniformly referenced protein chemical shifts. J Biomol NMR. 2003;25:173–95.
Mulder FA, Schipper D, Bott R, Boelens R. Altered flexibility in the substrate-binding site of related native and engineered high-alkaline Bacillus subtilisins. J Mol Biol. 1999;292:111–23.
Blanchet CE, Spilotros A, Schwemmer F, Graewert MA, Kikhney A, Jeffries CM, et al. Versatile sample environments and automation for biological solution X-ray scattering experiments at the P12 beamline (PETRA III, DESY). J Appl Crystallogr. 2015;48:431–43 International Union of Crystallography.
Debye P. Molecular-weight determination by light scattering. J Phys Colloid Chem. 1947;51:18–32.
Hansen S. Bayesian estimation of hyperparameters for indirect Fourier transformation in small-angle scattering. J Appl Crystallogr. 2000;33:1415–21.
Hansen S. BayesApp : a web site for indirect transformation of small-angle scattering data. J Appl Crystallogr. 2012;45:566–7 International Union of Crystallography (IUCr).
Kohn JE, Millett IS, Jacob J, Zagrovic B, Dillon TM, Cingel N, et al. Random-coil behavior and the dimensions of chemically unfolded proteins. Proc Natl Acad Sci U S A. 2004;101:12491–6.
UniProt. A worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47:D506–15.
Millard PS, Bugge K, Marabini R, Boomsma W, Burow M, Kragelund BB. IDDomainSpotter: compositional bias reveals domains in long disordered protein regions—insights from transcription factors. Protein Sci. Blackwell Publishing Ltd. 2020;29:169–83.
Holehouse AS, Das RK, Ahad JN, Richardson MOG, Pappu RV. CIDER: resources to analyze sequence-ensemble relationships of intrinsically disordered proteins. Biophys J. Biophysical Society. 2017;112:16–21.
Crooks GE, Hon G, Chandonia J-M, Brenner SE. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–90.
Sonnhammer EL, von Heijne G, Krogh A. A hidden Markov model for predicting transmembrane helices in protein sequences. Proceedings Int Conf Intell Syst Mol Biol. 1998;6:175–82.
Piovesan D, Tabaro F, Paladin L, Necci M, Mičetić I, Camilloni C, et al. MobiDB 3.0: more annotations for intrinsic disorder, conformational diversity and interactions in proteins. Nucleic Acids Res. 2018;46:471–6.
Cheng Y, LeGall T, Oldfield CJ, Dunker AK, Uversky VN. Abundance of intrinsic disorder in protein associated with cardiovascular disease. Biochemistry American Chemical Society. 2006;45:10448–60.
Gouw M, Michael S, Sámano-Sánchez H, Kumar M, Zeke A, Lang B, et al. The eukaryotic linear motif resource - 2018 update. Nucleic Acids Res. 2018;46.
Song C, Ye M, Liu Z, Cheng H, Jiang X, Han G, et al. Systematic analysis of protein phosphorylation networks from Phosphoproteomic data. Mol cell proteomics. MCP Papers in Press. 2012;11:1070–83.
Giri JG, Kumaki S, Ahdieh M, Friend DJ, Loomis A, Shanebeck K, et al. Identification and cloning of a novel IL-15 binding protein that is structurally related to the alpha chain of the IL-2 receptor. EMBO J Wiley. 1995;14:3654–63.
Mészáros B, Erdös G, Dosztányi Z. IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Res. 2018;46:W329–37.
Sigalov AB, Uversky VN. Differential occurrence of protein intrinsic disorder in the cytoplasmic signaling domains of cell receptors. Self Nonself. 2011;2:55–72.
Nielsen JT, Mulder FAA. Quality and bias of protein disorder predictors. Sci Rep Nature Publishing Group. 2019;9:1–11.
Mészáros B, Simon I, Dosztányi Z. Prediction of Protein Binding Regions in Disordered Proteins. Casadio R, editor. PLoS Comput Biol. Public Libr Sci. 2009;5:e1000376.
Christensen LF, Staby L, Bugge K, O’Shea C, Kragelund BB, Skriver K. Evolutionary conservation of the intrinsic disorder-based radical-induced cell Death1 hub interactome. Sci Rep Nature Research. 2019;9:1–15.
Liu W, Xie Y, Ma J, Luo X, Nie P, Zuo Z, et al. IBS: an illustrator for the presentation and visualization of biological sequences. Bioinformatics. 2015;31:3359–61.
Theillet F-X, Kalmar L, Tompa P, Han K-H, Selenko P, Dunker AK, et al. The alphabet of intrinsic disorder. Intrinsically Disord Proteins. Informa UK Limited. 2013;1:e24360.
von Heijne G. Control of topology and mode of assembly of a polytopic membrane protein by positively charged residues. Nature. 1989;341:456–8.
Gorelik M, Davidson AR. Distinct peptide binding specificities of Src homology 3 (SH3) protein domains can be determined by modulation of local energetics across the binding interface. J Biol Chem. 2012;287:9168–77.
Lin Y, Currie SL, Rosen MK. Intrinsically disordered sequences enable modulation of protein phase separation through distributed tyrosine motifs. J Biol Chem. 2017;292:19110–20.
Wang J, Choi JM, Holehouse AS, Lee HO, Zhang X, Jahnel M, et al. A Molecular Grammar Governing the Driving Forces for Phase Separation of Prion-like RNA Binding Proteins. Cell. 2018;174:688–99 e16 Cell Press.
Dosztanyi Z, Meszaros B, Simon I. ANCHOR: web server for predicting protein binding regions in disordered proteins. Bioinformatics. 2009;25:2745–6.
Rubin GM, Yandell MD, Wortman JR, Miklos GLG, Nelson CR, Hariharan IK, et al. Comparative genomics of the Eukaryotes. Science. 2000;287:2204–15.
Brown AM, Zondlo NJ. A propensity scale for type II Polyproline helices (PPII): aromatic amino acids in Proline-rich sequences strongly disfavor PPII due to Proline − aromatic interactions. Biochemistry. 2012;51:5041–51.
Ferreon JC, Hilser VJ. Thermodynamics of binding to SH3 domains: the energetic impact of Polyproline II (PII) Helix formation. Biochemistry. 2004;43:7787–97.
Ferrao R, Lupardus PJ. The Janus kinase (JAK) FERM and SH2 domains: bringing specificity to JAK–receptor interactions. Front Endocrinol (Lausanne). Frontiers. 2017;8:71.
Saksela K, Permi P. SH3 domain ligand binding: What’s the consensus and where’s the specificity? FEBS Lett. 2012;586:2609–14.
Fresno Vara JA, Cáceres MA, Silva A, Martín-Pérez J. Src family kinases are required for prolactin induction of cell proliferation. Mol Biol Cell. 2001;12:2171–83.
Floss DM, Mrotzek S, Klöcker T, Schröder J, Grötzinger J, Rose-John S, et al. Identification of canonical tyrosine-dependent and non-canonical tyrosine-independent STAT3 activation sites in the intracellular domain of the interleukin 23 receptor. J Biol Chem. 2013;288:19386–400.
Hörtner M, Nielsch U, Mayr LM, Heinrich PC, Haan S. A new high affinity binding site for suppressor of cytokine signaling-3 on the erythropoietin receptor. Eur J Biochem. 2002;269:2516–26.
Morris R, Kershaw NJ, Babon JJ. The molecular details of cytokine signaling via the JAK/STAT pathway. Protein Sci Blackwell Publishing Ltd. 2018;27:1984–2009.
Klingmüller U, Bergelson S, Hsiao JG, Lodish HF. Multiple tyrosine residues in the cytosolic domain of the erythropoietin receptor promote activation of STAT5. Proc Natl Acad Sci U S A National Academy of Sciences. 1996;93:8324–8.
Stahl N, Farruggella TJ, Boulton TG, Zhong Z, Darnell JE, Yancopoulos GD. Choice of STATs and other substrates specified by modular tyrosine-based motifs in cytokine receptors. Science. 1995;267:1349–53.
Thiel S, Behrmann I, Timmermann A, Dahmen H, Müller-Newen G, Schaper F, et al. Identification of a Leu-Ile internalization motif within the cytoplasmic domain of the leukaemia inhibitory factor receptor. Biochem J. 1999;339:15–9.
Amano Y, Yoshino K, Kojima K, Takeshita T. A hydrophobic amino acid cluster inserted into the C-terminus of a recycling cell surface receptor functions as an endosomal sorting signal. Biochem Biophys Res Commun. Elsevier Inc. 2013;441:164–8.
Hunter MG, McLemore M, Link DC, Loveland M, Copelan A, Avalos BR. Divergent pathways in COS-7 cells mediate defective internalization and intracellular routing of truncated G-CSFR forms in SCN/AML. PLoS One. 2008;3:1–8.
Hitchcock IS, Chen MM, King JR, Kaushansky K. YRRL motifs in the cytoplasmic domain of the thrombopoietin receptor regulate receptor internalization and degradation. Blood. 2008;112:2222–31.
Da Silva Almeida AC, Strous GJ, van Rossum AGSH. βTrCP controls GH receptor degradation via two different motifs. Mol Endocrinol The Endocrine Society. 2012;26:165–77.
Li Y, Suresh Kumar KG, Tang W, Spiegelman VS, Fuchs SY, et al. Mol Cell Biol. American Society for Microbiology. 2004;24:4038–48.
Meyer L, Deau BD, Forejtníková H, Duménil D, Margottin-Goguet F, Lacombe C, et al. β-Trcp mediates ubiquitination and degradation of the erythropoietin receptor and controls cell proliferation. Blood. 2007;109:5215–22.
Hörtner M, Nielsch U, Mayr LM, Johnston JA, Heinrich PC, Haan S. Suppressor of cytokine Signaling-3 is recruited to the activated granulocyte-Colony stimulating factor receptor and modulates its signal transduction. J Immunol The American Association of Immunologists. 2002;169:1219–27.
Bergamin E, Wu J, Hubbard SR. Structural basis for Phosphotyrosine recognition by suppressor of cytokine Signaling-3. Structure Elsevier. 2006;14:1285–92.
Kershaw NJ, Murphy JM, Liau NPD, Varghese LN, Laktyushin A, Whitlock EL, et al. SOCS3 binds specific receptor-JAK complexes to control cytokine signaling by direct kinase inhibition. Nat Struct Mol Biol Nature Publishing Group. 2013;20:469–76.
Berk AJ. Recent lessons in gene expression, cell cycle control, and cell biology from adenovirus. Oncogene. 2005;24:7673–85.
Bjørbæk C, Lavery HJ, Bates SH, Olson RK, Davis SM, Flier JS, et al. SOCS3 mediates feedback inhibition of the leptin receptor via Tyr985. J Biol Chem American Society for Biochemistry and Molecular Biology. 2000;275:40649–57.
Nicholson SE, De Souza D, Fabri LJ, Corbin J, Willson TA, Zhang JG, et al. Suppressor of cytokine signaling-3 preferentially binds to the SHP-2-binding site on the shared cytokine receptor subunit gp130. Proc Natl Acad Sci U S A. National Academy of Sciences. 2000;97:6493–8.
Sasaki A, Yasukawa H, Shouda T, Kitamura T, Dikic I, Yoshimura A. CIS3/SOCS-3 suppresses erythropoietin (EPO) signaling by binding the EPO receptor and JAK2. J Biol Chem. American Society for Biochemistry and Molecular Biology. 2000;275:29338–47.
Argetsinger LS, Campbell GS, Yang X, Witthuhn BA, Ihle JN, Carter-su C. Identification of JAK2 as a growth hormone receptor-associated tyrosine kinase. Cell. 1993;74:237–44.
Wright PE, Dyson HJ. Intrinsically disordered proteins in cellular signalling and regulation. Nat rev Mol cell biol. Nat Publ Group. 2015;16:18–29.
Olayioye MA, Guthridge MA, Stomski FC, Lopez AF, Visvader JE, Lindeman GJ. Threonine 391 phosphorylation of the human prolactin receptor mediates a novel interaction with 14-3-3 proteins. J Biol Chem. American Society for Biochemistry and Molecular Biology. 2003;278:32929–35.
Zheng J, Koblinski JE, Dutson LV, Feeney YB, Clevenger CV. Prolyl isomerase cyclophilin A regulation of Janus-activated kinase 2 and the progression of human breast cancer. Cancer Res. American Association for Cancer Research. 2008;68:7769–78.
Sliva D, Gu M, Zhu YX, Chen J, Tsai S, Du X, et al. 14–3-3zeta interacts with the alpha-chain of human interleukin 9 receptor. Biochem J. Portland Press Ltd. 2000;345(Pt 3):741–7.
Amit T, Bergman T, Dastot F, Youdim MBH, Amselem S, Hochberg Z. A membrane-fixed, truncated isoform of the human growth hormone receptor. J Clin Endocrinol Metab. 1997;82:3813–7.
Bezprozvanny I, Maximov A. Classification of PDZ domains. FEBS Lett. 2001;509:457–62.
Sheng M, Sala C. PDZ domains and the Organization of Supramolecular Complexes. Annu Rev Neurosci Annual Reviews. 2001;24:1–29.
Cunningham R, Biswas R, Steplock D, Shenolikar S, Weinman E. Role of NHERF and scaffolding proteins in proximal tubule transport. Urol Res Springer. 2010;38:257–62.
Kalia LV, Salter MW. Interactions between Src family protein tyrosine kinases and PSD-95. Neuropharmacology. Pergamon. 2003;45:720–8.
Bauer J, Bakke O, Morth JP. Overview of the membrane-associated RING-CH (MARCH) E3 ligase family. N Biotechnol. Elsevier B.V. 2017;38:7–15.
Ruff KM. Predicting Conformational Properties of Intrinsically Disordered Proteins from Sequence. In: Kragelund BB, Skriver K, editors. Intrinsically Disord Proteins Methods Protoc Methods Mol Biol: Springer Science+Business Media, LLC, part of Springer Nature. In press; 2020. p. 2141.
Marsh JA, Forman-Kay JD. Sequence determinants of compaction in intrinsically disordered proteins. Biophys J. 2010;98:2383–90.
Das RK, Ruff KM, Pappu RV. Relating sequence encoded information to form and function of intrinsically disordered proteins. Curr Opin Struct biol. Elsevier Ltd. 2015;32:102–12.
Ginell GM, Holehouse AS. Analyzing the Sequences of Intrinsically Disordered Regions with CIDER and localCIDER. In: Kragelund BB, Skriver K, editors. Intrinsically Disord Proteins Methods Protoc Methods Mol Biol: Springer Science+Business Media, LLC, part of Springer Nature. In press; 2020. p. 2141.
Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc Natl Acad Sci. 2013;110:13392–7.
Martin EW, Holehouse AS, Grace CR, Hughes A, Pappu RV, Mittag T. Sequence determinants of the conformational properties of an intrinsically disordered protein prior to and upon multisite phosphorylation. J Am Chem Soc. American Chemical Society. 2016;138:15323–35.
Müller-Späth S, Soranno A, Hirschfeld V, Hofmann H, Rüegger S, Reymond L, et al. Charge interactions can dominate the dimensions of intrinsically disordered proteins. Proc Natl Acad Sci U S A. National Academy of Sciences. 2010;107:14609–14.
Oldfield CJ, Meng J, Yang JY, Yang MQ, Uversky VN, Dunker AK. Flexible nets: disorder and induced fit in the associations of p53 and 14-3-3 with their partners. BMC Genomics. 2008;9:S1.
Dyson HJ, Wright PE. Coupling of folding and binding for unstructured proteins. Curr Opin Struct Biol. 2002;12:54–60.
Sørensen CS, Kjaergaard M. Effective concentrations enforced by intrinsically disordered linkers are governed by polymer physics. Proc Natl Acad Sci U S A. 2019;116:23124–31.
We would like to acknowledge all the important work done on decomposing the regions, domains and residues of the C1CR-ICDs essential to signaling and apologize to those whose work we did not cite. We thank Johan G. Olsen for valuable discussions and Mikkel M. for support. The synchrotron SAXS data was collected at beamline P12 operated by EMBL Hamburg at the PETRA III storage ring (DESY, Hamburg, Germany). We would like to thank Melissa Graewert for assistance during the beamtime.
This paper is a contribution from REPIN – rethinking protein interactions funded by the Novo Nordisk Foundation Challenge program (grant #NNF18OC0033926). The project was also funded by the Novo Nordisk Foundation SYNERGY program (grant #NNF15OC0016670).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figure S1: Disorder prediction for group 2, group 3, group 4 and group 5 C1CRs, but not the three common receptors. Figure S2: Disorder prediction for PRLR isoforms. Figure S3: Fractional differences in composition between the different C1CR-ICD groups or a set of IDPs, and a set of folded proteins calculated for each amino acid type. Figure S4: Sequence logos for Box1 shown for group 2, group 3, group 4 and group 5. Figure S5: R1 and R2 relaxation rates for PRLR-SF1b-ICD. Figure S6: Small-angle X-ray diffraction analyses of GHR-LF-ICD. Table S1: Proline cis-trans populations in PRLR-LF-ICD and PRLR-SF1b-ICD. Table S2: Overview of SLiMs lost and gained in C1CRs isoforms with unique sequences. Supplemental data: Interpretation of the diagram of states and conformational properties.
Cite this article
Seiffert, P., Bugge, K., Nygaard, M. et al. Orchestration of signaling by structural disorder in class 1 cytokine receptors. Cell Commun Signal 18, 132 (2020). https://doi.org/10.1186/s12964-020-00626-6
- Structural biology
- Cytokine receptors
- Transmembrane receptors