Commit 1b16e862 authored by Denys Bulavka

First commit

parent 33dac76c
# elm_processing
Database description:
The database was downloaded on 22 May 2015 and saved as elm_original_20150522.csv in the folder 'database'. The version history is available at http://elm.eu.org/infos/news.html. The file contains 210 motifs.
We make the following simplifications:
^ (at protein start) - we ignore this
$ (at end of protein) - we ignore this
| (OR) - we take the shorter alternative
{0,1} (optional element) - we take the shorter version, without the element
(A) amino acid modification - we ignore this
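The anchor, optional-element and modification rules above can be sketched in Python as follows. This is only an illustration, not the code in elm_processing: the function name `simplify` and the regular expressions it uses are assumptions, and the `|` rule (taking the shorter alternative) is deliberately left out because it would need a proper parser for nested groups.

```python
import re

# One motif element: a character class such as [ST] or [^P], a dot, or a single residue letter.
ELEMENT = r"(?:\[\^?[A-Z]+\]|\.|[A-Z])"


def simplify(motif):
    """Apply three of the simplifications to an ELM regular expression (hypothetical helper)."""
    # Ignore the protein-start anchor; a '^' directly after '[' is a negation and is kept.
    motif = re.sub(r"(?<!\[)\^", "", motif)
    # Ignore the protein-end anchor.
    motif = motif.replace("$", "")
    # {0,1}: take the shorter version, i.e. drop the optional element entirely.
    motif = re.sub(ELEMENT + r"\{0,1\}", "", motif)
    # (A): ignore the modification marker by unwrapping a parenthesised single element.
    motif = re.sub(r"\((" + ELEMENT + r")\)", r"\1", motif)
    return motif


print(simplify("^M{0,1}(C)."))            # DEG_Nend_UBRbox_4
print(simplify("[RK].L.{0,1}[FYLIVMP]"))  # DOC_CYCLIN_1
```

Ranges other than `{0,1}`, such as `{0,2}` or `{2,3}`, are left untouched by this sketch, since the simplifications listed above do not mention them.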
We also ignore the following motifs:
*TRG_PTS2 ^.{1,40}R[^P][^P][^P][LIV][^P][^P][HQ][LIF]
*LIG_PCNA_PIPBox_1 ((^.{0,3})|(Q)).[^FHWY][ILM][^P][^FHILVWYP][HFM][FMY]..
This leaves a list of 208 motifs. The file we will be working with is "database/elm_input.txt".
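As a rough illustration of how this file might be read, the sketch below parses lines of the form `NAME REGEX` and skips entries whose name starts with `*`, which is how the two ignored motifs are written above. The function name `parse_motifs` and the assumption that name and pattern are separated by whitespace are hypothetical and not taken from elm_processing.

```python
def parse_motifs(lines):
    """Return {name: regex} for every motif line not marked with a leading '*'."""
    motifs = {}
    for line in lines:
        parts = line.strip().split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, pattern = parts
        if name.startswith("*"):
            continue  # motif marked as ignored
        motifs[name] = pattern
    return motifs


sample = [
    "CLV_PCSK_KEX2_1 [KR]R.",
    "*TRG_PTS2 ^.{1,40}R[^P][^P][^P][LIV][^P][^P][HQ][LIF]",
    "LIG_RGD RGD",
]
print(parse_motifs(sample))
```

If database/elm_input.txt follows this layout, passing an open file handle to `parse_motifs` should give the 208 motifs mentioned above.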
Software for processing elm database
Analysis description:
We will take the following approach to the analysis:
Trim start and end positions with more than 10 characters.
We identified motifs that:
1. Have the same biological role
2. Are described as "minor variants"
The file 'database/marked_motifs.txt' has the list of motifs we will not take into account from elm_input_modif.txt.
Software:
Elm_processing:
Compile the code:
cd elm_processing
make
After compilation the binary will be placed in the folder elm_processing/bin.
Generate files for Python scripts:
The input files for the Python scripts are already generated. If needed, they can be regenerated with the following commands, run from the directory elm_processing/bin:
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 6 >> structure_empiric_frequency.txt
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 7 >> structure_theoretic_probabilities.txt
Other available reports:
Number of amino acids per coordinate:
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 1
Amino acids per coordinate:
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 2
Number of k discriminant motifs:
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 3 -k NUMBER
Pairs of k discriminant motifs:
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 4 -k NUMBER
For each k, outputs the number of pairs of motifs with k discriminant positions:
./main -d ../../database/elm_input.txt -c -m ../../database/marked_motifs.txt -r 5
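When several reports are needed, the command lines above can be assembled programmatically. The sketch below only builds the argument list, using the same flags as the commands in this section. The function name `report_command` and the rule that `-k` is passed only for reports 3 and 4 are assumptions drawn from those examples, not from the program's source.

```python
DATABASE = "../../database/elm_input.txt"
MARKED = "../../database/marked_motifs.txt"


def report_command(report, k=None):
    """Build the ./main argument list for a report number (hypothetical helper)."""
    cmd = ["./main", "-d", DATABASE, "-c", "-m", MARKED, "-r", str(report)]
    if report in (3, 4):
        # Only reports 3 and 4 are shown with a -k value in the examples.
        if k is None:
            raise ValueError("reports 3 and 4 take a -k value in the examples above")
        cmd += ["-k", str(k)]
    return cmd


print(" ".join(report_command(5)))
print(" ".join(report_command(3, k=2)))
```

The resulting list can be handed to `subprocess.run` with the working directory set to elm_processing/bin.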
Python scripts:
Each Python script has an input directory and an output directory. To execute a script, it is enough to be inside the script folder and run 'python main.py', which generates the output and places it in the output directory. The rule is that a script with "empiric" in its name should have "structure_empiric_frequency.txt" in its input folder, while a script with "theoretic" in its name should have "structure_theoretic_probabilities.txt" in its input folder.
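The naming rule can be written down as a small check. This sketch is an illustration only: the function name `required_input` is hypothetical, the example script name is made up, and it assumes the rule is applied to the name of the script folder.

```python
def required_input(script_name):
    """Return the input file a script needs, based on its name (hypothetical helper)."""
    if "empiric" in script_name:
        return "structure_empiric_frequency.txt"
    if "theoretic" in script_name:
        return "structure_theoretic_probabilities.txt"
    raise ValueError("name contains neither 'empiric' nor 'theoretic': " + script_name)


# "empiric_example" is a made-up folder name used only for illustration.
print(required_input("empiric_example"))
```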
CLV_C14_Caspase3-7 [DSTE][^P][^DEWHFYC]D[GSAN]
CLV_MEL_PAP_1 [ILV]..R[VF][GS].
CLV_NRD_NRD_1 (.RK)|(RR[^KR])
CLV_PCSK_FUR_1 R.[RK]R.
CLV_PCSK_KEX2_1 [KR]R.
CLV_PCSK_PC1ET2_1 KR.
CLV_PCSK_PC7_1 R...[KR]R.
CLV_PCSK_SKI1_1 [RK].[AILMFV][LTKF].
CLV_Separin_Fungi S[IVLMH]E[IVPFMLYAQR]GR.
CLV_Separin_Metazoa E[IMPVL][MLVP]R.
CLV_TASPASE1 Q[MLVI]DG..[DE]
DEG_APCC_DBOX_1 .R..L..[LIVM].
DEG_APCC_KENBOX_2 .KEN.
DEG_APCC_TPR_1 .[ILM]R$
DEG_COP1 [DE][DE]...VP[DE]
DEG_CRL4_CDT2_1 [NQ]{0,1}..[ILMV][ST][DEN][FY][FY].{2,3}[KR]{2,3}[^DE]
DEG_CRL4_CDT2_2 [NQ]{0,1}..[ILMV]T[DEN][HMFY][FMY].{2,3}[KR]{2,3}[^DE]
DEG_MDM2_1 F...W..[LIV]
DEG_Nend_UBRbox_4 ^M{0,1}(C).
DEG_ODPH_VHL_1 [IL]A(P).{6,8}[FLIVM].[FLIVM]
DEG_SCF_COI1_1 ..[RK][RK].SL..F[FLM].[RK]R[HRK].[RK].
DEG_SCF_FBW7_1 [LIVMP].{0,2}(T)P..([ST])
DEG_SCF_FBW7_2 [LIVMP].{0,2}(T)P..E
DEG_SCF_SKP2-CKS1_1 ..[DE].(T)P.K
DEG_SCF_TIR1_1 .[VLIA][VLI]GWPP[VLI]...R.
DEG_SCF_TRCP1_1 D(S)G.{2,3}([ST])
DEG_SIAH_1 .P.A.V.P[^P]
DOC_AGCK_PIF_1 F..[FWY][ST][FY]
DOC_AGCK_PIF_2 F..[FWY][DE][FY]
DOC_AGCK_PIF_3 F..F$
DOC_ANK_TNKS_1 .R..[PGAV][DEIP]G.
DOC_CKS1_1 [MPVLIFWYQ].(T)P..
DOC_CYCLIN_1 [RK].L.{0,1}[FYLIVMP]
DOC_MAPK_1 [KR]{0,2}[KR].{0,2}[KR].{2,4}[ILVM].[ILVF]
DOC_MAPK_2 F.FP
DOC_PIKK_1 [DEN][DEN].{2,3}[ILMVA][DEN][DEN]L
DOC_PP1_RVXF_1 ..[RK].{0,1}[VIL][^P][FW].
DOC_PP1_SILK_1 .[GS]IL[KR][^DE]
DOC_PP2B_1 .P[^P]I[^P][IV][^P]
DOC_PP2B_2 L.[LIVAPM]P
DOC_SPAK_OSR1_1 RF[^P][IV].
DOC_USP7_1 [PA][^P][^FYWIL]S[^P]
DOC_USP7_2 P.E[^P].S[^P]
DOC_WD40_RPTOR_TOS_1 F[EDQS][MILV][ED][MILV]((.{0,1}[ED])|($))
DOC_WW_Pin1_4 ...([ST])P.
LIG_14-3-3_1 R.[^P]([ST])[^P]P
LIG_14-3-3_2 R..[^P]([ST])[IVLM].
LIG_14-3-3_3 [RHK][STALV].([ST]).[PESRDIFTQ]
LIG_Actin_RPEL_3 [IL]..[^P][^P][^P][^P]R.....[IL]..[^P][^P][ILV][ILM]
LIG_Actin_WH2_1 R..[ILVMF][ILMVF][^P][^P][ILVM].{4,7}L(([KR].)|(NK))[VATI]
LIG_Actin_WH2_2 [^R]..((.[ILMVF])|([ILMVF].))[^P][^P][ILVM].{4,7}L(([KR].)|(NK))[VATIGS]
LIG_AP2alpha_1 F.D.F
LIG_AP2alpha_2 DP[FW]
LIG_APCC_Cbox_1 [DE]R[YFH][ILFVM][PAG].R
LIG_APCC_Cbox_2 DR[YFH][ILFVM][PA]..
LIG_AP_GAE_1 [DE][DES][DEGAS]F[SGAD][DEAP][LVIMFD]
LIG_BIR_III_1 ^M{0,1}A.P.
LIG_BIR_III_2 DA.P.
LIG_BIR_III_3 ^M{0,1}A.[AP].
LIG_BIR_III_4 DA.G.
LIG_BRCT_BRCA1_1 .(S)..F
LIG_BRCT_BRCA1_2 .(S)..F.K
LIG_BRCT_MDC1_1 .(S)..Y$
LIG_CaMK_CASK_1 ((SP)|([ED].{0,1}))[IV]W[IVL].R
LIG_CAP-Gly_1 [ED].{0,2}[ED].{0,2}[EDQ].{0,1}[YF]$
LIG_CAP-Gly_2 .W[RK][DE]GCY$
LIG_Clathr_ClatBox_1 L[IVLMF].[IVLMF][DE]
LIG_Clathr_ClatBox_2 .[NP]W[DES].W
LIG_CORNRBOX L[^P]{2,2}[HI]I[^P]{2,2}[IAV][IL]
LIG_CtBP_PxDLS_1 (P[LVIPME][DENS][LM][VASTRG])|(G[LVIPME][DENS][LM][VASTRG]((K)|(.[KR])))
LIG_Dynein_DLC8_1 [^P].[KR].TQT
LIG_EABR_CEP55_1 .A.GPP.{2,3}Y.
LIG_EF_ALG2_ABM_1 P[PG]{0,1}YP.{1,6}Y[QS]{0,1}P
LIG_EF_ALG2_ABM_2 P.P.{0,1}GF
LIG_EH_1 .NPF.
LIG_EH1_1 .[FYH].[IVM][^WFYP][^WFYP][ILM][ILMV].
LIG_eIF4E_1 Y....L[VILMF]
LIG_eIF4E_2 Y.PP.[ILMV]R
LIG_EVH1_1 ([FYWL]P.PP)|([FYWL]PP[ALIVTFY]P)
LIG_EVH1_2 PP..F
LIG_EVH1_3 [FY].[FW].....[LMVIF]P.P[DE]
LIG_FAT_LD_1 [LV][DE][^P][LM][LM][^P][^P]L[^P]
LIG_FHA_1 ..(T)..[ILV].
LIG_FHA_2 ..(T)..[DE].
LIG_GLEBS_BUB3_1 [EN][FYLW][NSQ].EE[ILMVF][^P][LIVMFA]
LIG_GYF [QHR].{0,1}P[PL]PP[GS]H[RH]
LIG_HCF-1_HBM_1 [DE]H.Y
LIG_HOMEOBOX [FY][DEP]WM
LIG_HP1_1 P[MVLIRWY]V[MVLIAS][LM]
LIG_Integrin_isoDGR_1 NGR
LIG_IQ ...[SACLIVTM]..[ILVMFCT]Q.{3,3}[RK].{4,5}[RKQ]..
LIG_KEPE_1 [VILMFT]K.EP.[DE]
LIG_KEPE_2 [VILMFT]K.EP.{2,3}[DE]
LIG_KEPE_3 [VILMFT]K.EP....[DE]
LIG_LIR_Apic_2 [EDST].{0,2}[WFY]..P
LIG_LIR_Gen_1 [EDST].{0,2}[WFY]..[ILV]
LIG_LIR_LC3C_4 [EDST].{0,2}LVV
LIG_LIR_Nem_3 [EDST].{0,2}[WFY]..[ILVFY]
LIG_LYPXL_L_2 [LM]YP...[LI][^P][^P][LI]
LIG_LYPXL_S_1 [LM]YP.[LI]
LIG_MAD2 [KR][IV][LV].....P
LIG_MYND_1 P.L.P
LIG_MYND_2 PP.LI
LIG_MYND_3 [LMV]P.LE
LIG_NBox_RRM_1 F..A[ILV]..A..[ILV]
LIG_NRBOX [^P]L[^P][^P]LL[^P]
LIG_OCRL_FandH_1 .F[^P][^P][KRIL]H[^P][^P][YLMFH][^P]...
LIG_PAM2_1 ..[LFP][NS][PIVTAFL].A..(([FY].[PYLF])|(W..)).
LIG_PAM2_2 ((WPP)|([FL][PV][APQ]))EF.PG.PWKG.
*LIG_PCNA_PIPBox_1 ((^.{0,3})|(Q)).[^FHWY][ILM][^P][^FHILVWYP][HFM][FMY]..
LIG_PDZ_Class_1 ...[ST].[ACVILF]$
LIG_PDZ_Class_2 ...[VLIFY].[ACVILF]$
LIG_PDZ_Class_3 ...[DE].[ACVILF]$
LIG_PTAP_UEV_1 .P[TS]AP.
LIG_PTB_Apo_2 (.[^P].NP.[FY].)|(.[ILVMFY].N..[FY].)
LIG_PTB_Phospho_1 (.[^P].NP.(Y))|(.[ILVMFY].N..(Y))
LIG_Rb_LxCxE_1 [LI].C.[DE]
LIG_Rb_pABgroove_1 ..[LIMV]..[LM][FY]D.
LIG_RGD RGD
LIG_RRM_PRI_1 .[ILVM]LG..P.
LIG_SH2_GRB2 (Y).N.
LIG_SH2_PTP2 (Y)[IV].[VILP]
LIG_SH2_SRC (Y)[QDEVAIL][DENPYHI][IPVGAHS]
LIG_SH2_STAT3 (Y)..Q
LIG_SH2_STAT5 (Y)[VLTFIC]..
LIG_SH2_STAT6 G(Y)[KQ].F
LIG_SH3_1 [RKY]..P..P
LIG_SH3_2 P..P.[KR]
LIG_SH3_3 ...[PV]..P
LIG_SH3_4 KP..[QK]...
LIG_SH3_5 P..DY
LIG_Sin3_1 [LIV]..[LM]L.AA.[FY][LI]
LIG_Sin3_2 [FHYM].A[AV].[VAC]L[MV].[MI]
LIG_Sin3_3 [FA].[LA][LV][LVI]..[AM]
LIG_SPRY_1 [ED][LIV]NNN[^P]
LIG_SUFU_1 [SV][CY]GH[LIF][LAST][GAIV].
LIG_SUMO_SBM_1 [ILV](.[ILV]|[ILV]|[ILV].)[ILV][STDE]{1,10}
LIG_SUMO_SBM_2 [STDE]{1,10}[ILV](.[ILV]|[ILV]|[ILV].)[ILV]
LIG_SxIP_EBH_1 ([KR][^ED]{0,5}[ST].IP[^ED]{5,5})|([^ED]{5,5}[ST].IP[^ED]{0,5}[KR])
LIG_TPR EEVD$
LIG_TRAF2_1 [PSAT].[QE]E
LIG_TRAF2_2 P.Q..D
LIG_TRAF6 ..P.E..[FYWHDE].
LIG_TRFH_1 [FY].L.P
LIG_TYR_ITAM [DEN]..(Y)..[LI].{6,12}(Y)..[LI]
LIG_TYR_ITIM [ILV].(Y)..[ILV]
LIG_TYR_ITSM ..T.(Y)..[IV]
LIG_ULM_U2AF65_1 [KR]{1,4}[KR].[KR]W.
LIG_WD40_WDR5_1 [ED].{0,3}[VI]D[VI]
LIG_WD40_WDR5_2 [EDSTY].{0,4}[VIPLA][TSDEKR][ILVA]
LIG_WD40_WDR5_WIN_1 [HN].[HNST]G[SCA]AR[STAC][EQ][GPVILM][YFHKRQN][YHLIVMATS]
LIG_WD40_WDR5_WIN_2 [HNCSVI]..[GDE][STCA][AGVS]R[STCA][EQR][GPLAV]
LIG_WD40_WDR5_WIN_3 [HNSTE].[TSQN]P{0,1}GS{0,1}[SCA][AFWH][KR][TAS][DEQ][GP][RKYFIVAMW]..[IVM]
LIG_WH1 ES[RK][FY].F[HR][PST][IVLM][DES][DE]
LIG_WRPW_1 [WFY]RP[WFY].{0,7}$
LIG_WRPW_2 [WFY][KR]P[WFY]
LIG_WW_1 PP.Y
LIG_WW_2 PPLP
LIG_WW_3 .PPR.
MOD_ASX_betaOH_EGF C.([DN]).{4,4}[FY].C.C
MOD_CAAXbox (C)[^DENQ][LIVM].$
MOD_CDK_1 ...([ST])P.[KR]
MOD_CK1_1 S..([ST])...
MOD_CK2_1 ...([ST])..E
MOD_CMANNOS (W)..W
MOD_GlcNHglycan [ED]{0,3}.(S)[GA].
MOD_GSK3_1 ...([ST])...[ST]
MOD_LATS_1 H.[KR]..([ST])[^P]
MOD_NEK2_1 [FLM][^P][^P]([ST])[^DEP][^DE]
MOD_N-GLC_1 .(N)[^P][ST]..
MOD_N-GLC_2 (N)[^P]C
MOD_NMyristoyl ^M{0,1}(G)[^EDRKHPFYW]..[STAGCN][^P]
MOD_OFUCOSY C.{3,5}([ST])C
MOD_OGLYCOS C.(S).PC
MOD_PIKK_1 ...([ST])Q..
MOD_PK_1 [RK]..(S)[VI]..
MOD_PKA_1 [RK][RK].([ST])[^P]..
MOD_PKA_2 .R.([ST])[^P]..
MOD_PKB_1 R.R..([ST])[^P]..
MOD_PLK .[DE].([ST])[ILFWMVA]..
MOD_ProDKin_1 ...([ST])P..
MOD_SPalmitoyl_2 G(C)M[GS][CL][KP]C
MOD_SPalmitoyl_4 ^M{0,1}G(C)..S[AKS]
MOD_SUMO [VILMAFP](K).E
MOD_TYR_CSK [TAD][EA].Q(Y)[QE].[GQA][PEDLS]
MOD_TYR_DYR ..[RKTC][IVL]Y[TQHS](Y)[IL]QSR
MOD_WntLipid [ETA](C)[QERK]..F...RWNC[ST]
TRG_AP2beta_CARGO_1 [DE].{1,2}F[^P][^P][FL][^P][^P][^P]R
TRG_Cilium_Arf4_1 QV.P.$
TRG_Cilium_RVxP_2 RV.P.
TRG_ENDOCYTIC_2 Y..[LMVIF]
TRG_ER_diArg_1 ([LIVMFYWPR]R[^YFWDE]{0,1}R)|(R[^YFWDE]{0,1}R[LIVMFYWPR])
TRG_ER_diLys_1 K.{0,1}K.{2,3}$
TRG_ER_FFAT_1 [DE].{0,4}E[FY][FYK]D[AC].[ESTD]
TRG_ER_KDEL_1 [KRHQSAP][DENQT]EL$
TRG_Golgi_diPhe_1 Q.{6,6}FF.{6,7}$
TRG_LysEnd_APsAcLL_1 [DERQ]...L[LVI]
TRG_LysEnd_APsAcLL_3 [DET]E[RK].PL[LI]
TRG_LysEnd_GGAAcLL_1 D..LL.{1,2}$
TRG_LysEnd_GGAAcLL_2 S[LW]LD[DE]EL[LM]
TRG_NES_CRM1_1 ([DEQ].{0,1}[LIM].{2,3}[LIVMF][^P]{2,3}[LMVF].[LMIV].{0,3}[DE])|([DE].{0,1}[LIM].{2,3}[LIVMF][^P]{2,3}[LMVF].[LMIV].{0,3}[DEQ])
TRG_NLS_Bipartite_1 [KR][KR].{7,15}[^DE]((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE]
TRG_NLS_MonoCore_2 [^DE]((K[RK])|(RK))[KRP][KR][^DE]
TRG_NLS_MonoExtC_3 [^DE]((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))(([PKR])|([^DE][DE]))
TRG_NLS_MonoExtN_4 (([PKR].{0,1}[^DE])|([PKR]))((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE]
TRG_PEX_1 W...[FY]
TRG_PEX_2 F...[WF]
TRG_PEX_3 L..LL...L..F
TRG_PTS1 (.[SAPTC][KRH][LMFI]$)|([KRH][SAPTC][NTS][LMFI]$)
*TRG_PTS2 ^.{1,40}R[^P][^P][^P][LIV][^P][^P][HQ][LIF]
CLV_C14_Caspase3-7 Caspase-3 and Caspase-7 cleavage site. [DSTE][^P][^DEWHFYC]D[GSAN] 39 http://elm.eu.org/elms/elmPages/CLV_C14_Caspase3-7.html
CLV_MEL_PAP_1 Prophenoloxidase-activating proteinase (PAP) cleavage site ([ILV]-X-X-R-|-[FV]-[GS]-X). [ILV]..R[VF][GS]. 12 http://elm.eu.org/elms/elmPages/CLV_MEL_PAP_1.html
CLV_NRD_NRD_1 N-Arg dibasic convertase (NRD/Nardilysin) cleavage site (X-|-R-K or R-|-R-X). (.RK)|(RR[^KR]) 2 http://elm.eu.org/elms/elmPages/CLV_NRD_NRD_1.html
CLV_PCSK_FUR_1 Furin (PACE) cleavage site (R-X-[RK]-R-|-X). R.[RK]R. 13 http://elm.eu.org/elms/elmPages/CLV_PCSK_FUR_1.html
CLV_PCSK_KEX2_1 Yeast kexin 2 cleavage site (K-R-|-X or R-R-|-X). [KR]R. 1 http://elm.eu.org/elms/elmPages/CLV_PCSK_KEX2_1.html
CLV_PCSK_PC1ET2_1 NEC1/NEC2 cleavage site (K-R-|-X). KR. 6 http://elm.eu.org/elms/elmPages/CLV_PCSK_PC1ET2_1.html
CLV_PCSK_PC7_1 Proprotein convertase 7 (PC7, PCSK7) cleavage site (R-X-X-X-[RK]-R-|-X). R...[KR]R. 1 http://elm.eu.org/elms/elmPages/CLV_PCSK_PC7_1.html
CLV_PCSK_SKI1_1 Subtilisin/kexin isozyme-1 (SKI1) cleavage site ([RK]-X-[hydrophobic]-[LTKF]-|-X). [RK].[AILMFV][LTKF]. 2 http://elm.eu.org/elms/elmPages/CLV_PCSK_SKI1_1.html
CLV_Separin_Fungi Separase cleavage site, best known in sister chromatid separation. Also involved in stabilizing the anaphase spindle and centriole disengagement. S[IVLMH]E[IVPFMLYAQR]GR. 4 http://elm.eu.org/elms/elmPages/CLV_Separin_Fungi.html
CLV_Separin_Metazoa Separase cleavage site, best known in sister chromatid separation. E[IMPVL][MLVP]R. 5 http://elm.eu.org/elms/elmPages/CLV_Separin_Metazoa.html
CLV_TASPASE1 Taspase1 is a threonine aspartase which was first identified as the protease responsible for processing the trithorax (MLL) type of histone methyltransferases. Q[MLVI]DG..[DE] 2 http://elm.eu.org/elms/elmPages/CLV_TASPASE1.html
DEG_APCC_DBOX_1 An RxxL-based motif that binds to the Cdh1 and Cdc20 components of APC/C thereby targeting the protein for destruction in a cell cycle dependent manner .R..L..[LIVM]. 11 http://elm.eu.org/elms/elmPages/DEG_APCC_DBOX_1.html
DEG_APCC_KENBOX_2 Motif conserving the exact sequence KEN that binds to the APC/C subunit Cdh1 causing the protein to be targeted for 26S proteasome mediated degradation. .KEN. 16 http://elm.eu.org/elms/elmPages/DEG_APCC_KENBOX_2.html
DEG_APCC_TPR_1 This short C-terminal motif is present in co-activators, the Doc1/APC10 subunit and some substrates of the APC/C and mediates direct binding to TPR-containing APC/C core subunits. .[ILM]R$ 22 http://elm.eu.org/elms/elmPages/DEG_APCC_TPR_1.html
DEG_COP1 COP1 binding motif. The ring finger protein COP1 is an E3 ubiquitin ligase that regulates plant light sensitive development and in mammals can target P53 for destruction. [DE][DE]...VP[DE] 4 http://elm.eu.org/elms/elmPages/DEG_COP1.html
DEG_CRL4_CDT2_1 This degron overlaps a PCNA interaction protein (PIP) box and is recognised by the CRL4<sup>Cdt2</sup> ubiquitin ligase in a PCNA- and chromatin-dependent manner. [NQ]{0,1}..[ILMV][ST][DEN][FY][FY].{2,3}[KR]{2,3}[^DE] 6 http://elm.eu.org/elms/elmPages/DEG_CRL4_CDT2_1.html
DEG_CRL4_CDT2_2 This degron, occurring in non-Vertebrates, overlaps a PCNA interaction protein (PIP) box and is recognised by the CRL4<sup>Cdt2</sup> ubiquitin ligase in a PCNA- and chromatin-dependent manner. [NQ]{0,1}..[ILMV]T[DEN][HMFY][FMY].{2,3}[KR]{2,3}[^DE] 1 http://elm.eu.org/elms/elmPages/DEG_CRL4_CDT2_2.html
DEG_MDM2_1 Motif found in p53 family members which confers binding to the N-terminal domain of MDM2. F...W..[LIV] 3 http://elm.eu.org/elms/elmPages/DEG_MDM2_1.html
DEG_Nend_UBRbox_4 N-terminal motif that initiates protein degradation by binding to the UBR-box of N-recognins. This N-degron variant comprises N-terminal Cys as destabilizing residue. ^M{0,1}(C). 8 http://elm.eu.org/elms/elmPages/DEG_Nend_UBRbox_4.html
DEG_ODPH_VHL_1 Oxygen dependent prolyl hydroxylation motif in the unstructured region of hypoxia-inducible factor protein and bound by the VHL ligand. [IL]A(P).{6,8}[FLIVM].[FLIVM] http://elm.eu.org/elms/elmPages/DEG_ODPH_VHL_1.html
DEG_SCF_COI1_1 This degron motif is present in JAZ transcriptional repressor proteins and binds to the COI1 F-box protein of the SCF E3 ubiquitin ligase in a jasmonate-dependent manner. ..[RK][RK].SL..F[FLM].[RK]R[HRK].[RK]. 9 http://elm.eu.org/elms/elmPages/DEG_SCF_COI1_1.html
DEG_SCF_FBW7_1 The TPxxS phospho-dependent degron binds the FBW7 F box proteins of the SCF (Skp1_Cullin-Fbox) complex. [LIVMP].{0,2}(T)P..([ST]) 6 http://elm.eu.org/elms/elmPages/DEG_SCF_FBW7_1.html
DEG_SCF_FBW7_2 The TPxxE phospho-dependent degron binds the FBW7 F box proteins of the SCF (Skp1_Cullin-Fbox) complex. [LIVMP].{0,2}(T)P..E 2 http://elm.eu.org/elms/elmPages/DEG_SCF_FBW7_2.html
DEG_SCF_SKP2-CKS1_1 Degradation motif recognised by a pre-assembled complex consisting of Skp2 (an F box protein of the SCF E3 ubiquitin ligase) and Cks1, which leads to ubiquitylation and subsequent proteosomal degradation. ..[DE].(T)P.K 3 http://elm.eu.org/elms/elmPages/DEG_SCF_SKP2-CKS1_1.html
DEG_SCF_TIR1_1 This degron motif is present in Aux/IAA transcriptional repressor proteins and binds to TIR1/AFB F-box proteins of the SCF E3 ubiquitin ligase in an auxin-dependent manner. .[VLIA][VLI]GWPP[VLI]...R. 24 http://elm.eu.org/elms/elmPages/DEG_SCF_TIR1_1.html
DEG_SCF_TRCP1_1 The DSGxxS phospho-dependent degron binds the F box protein of the SCF-betaTrCP1 complex. The degron is found in various proteins that function in regulation of cell state. D(S)G.{2,3}([ST]) 18 http://elm.eu.org/elms/elmPages/DEG_SCF_TRCP1_1.html
DEG_SIAH_1 The PxAxVxP peptide binds to the substrate-binding domain (SBD) of the Siah family members .P.A.V.P[^P] 9 http://elm.eu.org/elms/elmPages/DEG_SIAH_1.html
DOC_AGCK_PIF_1 The DOC_AGCK_PIF_1 motif contains a phosphorylatable serine/threonine residue that allows fine-tuning of the affinity of the motif for the PIF pocket, with the phosphorylated motif showing a higher affinity. F..[FWY][ST][FY] 10 http://elm.eu.org/elms/elmPages/DOC_AGCK_PIF_1.html
DOC_AGCK_PIF_2 In the DOC_AGCK_PIF_2 motif the phosphorylatable serine/threonine residue is replaced by an acidic aspartate or glutamate residue. F..[FWY][DE][FY] 5 http://elm.eu.org/elms/elmPages/DOC_AGCK_PIF_2.html
DOC_AGCK_PIF_3 The DOC_AGCK_PIF_3 variant consists only of the first two core aromatic residues preceding the phosphorylatable or acidic site in the other variants, and the latter of these two aromatic residues is the C-terminal residue of the kinase sequence. F..F$ http://elm.eu.org/elms/elmPages/DOC_AGCK_PIF_3.html
DOC_ANK_TNKS_1 The Tankyrase binding motif interacts with the ankyrin repeat domain region in Tankyrase-1 and Tankyrase-2 to facilitate the PARsylation of the target proteins. .R..[PGAV][DEIP]G. 17 http://elm.eu.org/elms/elmPages/DOC_ANK_TNKS_1.html
DOC_CKS1_1 Phospho-dependent motif that mediates docking of CDK substrates and regulators to cyclin-CDK-bound Cks1. [MPVLIFWYQ].(T)P.. 8 http://elm.eu.org/elms/elmPages/DOC_CKS1_1.html
DOC_CYCLIN_1 Substrate recognition site that interacts with cyclin and thereby increases phosphorylation by cyclin/cdk complexes. Predicted proteins should have a CDK phosphorylation site. Also used by cyclin/cdk inhibitors. [RK].L.{0,1}[FYLIVMP] 24 http://elm.eu.org/elms/elmPages/DOC_CYCLIN_1.html
DOC_MAPK_1 MAPK interacting molecules (e.g. MAPKKs, substrates, phosphatases) carry docking motif that help to regulate specific interaction in the MAPK cascade. The classic motif approximates (R/K)xxxx#x# where # is a hydrophobic residue. [KR]{0,2}[KR].{0,2}[KR].{2,4}[ILVM].[ILVF] 16 http://elm.eu.org/elms/elmPages/DOC_MAPK_1.html
DOC_MAPK_2 MAPK interacting molecules (e.g. MAPKKs, substrates, phosphatases) carry docking motif that help to regulate specific interaction in the MAPK cascade. F.FP 7 http://elm.eu.org/elms/elmPages/DOC_MAPK_2.html
DOC_PIKK_1 DOC_PIKK_1 motif is located in the C terminus of Nbs1 and its homologues and interacts with PIKK family members. [DEN][DEN].{2,3}[ILMVA][DEN][DEN]L 4 http://elm.eu.org/elms/elmPages/DOC_PIKK_1.html
DOC_PP1_RVXF_1 Protein phosphatase 1 catalytic subunit (PP1c) interacting motif binds targeting proteins that dock to the substrate for dephosphorylation. The motif defined is [RK]{0,1}[VI][^P][FW]. ..[RK].{0,1}[VIL][^P][FW]. 20 http://elm.eu.org/elms/elmPages/DOC_PP1_RVXF_1.html
DOC_PP1_SILK_1 Protein phosphatase 1 catalytic subunit (PP1c) interacting motif that often cooperates with and is located N-terminal to the RVXF motif to dock proteins to PP1c. .[GS]IL[KR][^DE] 14 http://elm.eu.org/elms/elmPages/DOC_PP1_SILK_1.html
DOC_PP2B_1 Calcineurin substrate docking site, leads to the effective dephosphorylation of serine/threonine phosphorylation sites. .P[^P]I[^P][IV][^P] 9 http://elm.eu.org/elms/elmPages/DOC_PP2B_1.html
DOC_PP2B_2 Docking motif in calcineurin substrates that binds at the interface of the catalytic CNA and regulatory CNB subunits. L.[LIVAPM]P 8 http://elm.eu.org/elms/elmPages/DOC_PP2B_2.html
DOC_SPAK_OSR1_1 SPAK/OSR1 kinase binding motif acts as a docking site which aids the interaction with their binding partners including the upstream activators and the phosphorylated substrates. RF[^P][IV]. 13 http://elm.eu.org/elms/elmPages/DOC_SPAK_OSR1_1.html
DOC_USP7_1 The USP7 NTD domain binding motif variant based on the MDM2 and P53 interactions. [PA][^P][^FYWIL]S[^P] 10 http://elm.eu.org/elms/elmPages/DOC_USP7_1.html
DOC_USP7_2 The USP7 NTD domain binding motif variant based on the EBV EBNA1 interaction. P.E[^P].S[^P] 1 http://elm.eu.org/elms/elmPages/DOC_USP7_2.html
DOC_WD40_RPTOR_TOS_1 The TOR pathway adaptor protein Raptor links the mTOR kinase to the TOS motif containing substrates 4E-BP1 and S6-beta kinases. Proteins with TOR motif (e.g. 4E-BP1, S6KB1) participate in the transcription mechanism. F[EDQS][MILV][ED][MILV]((.{0,1}[ED])|($)) 5 http://elm.eu.org/elms/elmPages/DOC_WD40_RPTOR_TOS_1.html
DOC_WW_Pin1_4 The Class IV WW domain interaction motif is recognised primarily by the Pin1 phosphorylation-dependent prolyl isomerase. ...([ST])P. 96 http://elm.eu.org/elms/elmPages/DOC_WW_Pin1_4.html
LIG_14-3-3_1 Mode 1 interacting phospho-motif for 14-3-3 proteins with key conservation RxxSxP. R.[^P]([ST])[^P]P 16 http://elm.eu.org/elms/elmPages/LIG_14-3-3_1.html
LIG_14-3-3_2 Longer mode 2 interacting phospho-motif for 14-3-3 proteins with key conservation RxxxS#p. R..[^P]([ST])[IVLM]. 7 http://elm.eu.org/elms/elmPages/LIG_14-3-3_2.html
LIG_14-3-3_3 Consensus derived from reported natural interactors which do not match the Mode 1 and Mode 2 ligands. [RHK][STALV].([ST]).[PESRDIFTQ] 22 http://elm.eu.org/elms/elmPages/LIG_14-3-3_3.html
LIG_Actin_RPEL_3 RPEL motif, present in proteins in several repeats, mediates binding to the hydrophobic cleft created by subdomains 1 and 3 of G-actin. [IL]..[^P][^P][^P][^P]R.....[IL]..[^P][^P][ILV][ILM] 13 http://elm.eu.org/elms/elmPages/LIG_Actin_RPEL_3.html
LIG_Actin_WH2_1 WH2 is a motif of variable length (16-19 amino acids) binding to the hydrophobic cleft formed by actin's subdomains 1 and 3. At the N-terminus it forms an alpha-helix followed by a flexible loop stabilised upon actin binding. R..[ILVMF][ILMVF][^P][^P][ILVM].{4,7}L(([KR].)|(NK))[VATI] 6 http://elm.eu.org/elms/elmPages/LIG_Actin_WH2_1.html
LIG_Actin_WH2_2 The WH2 motif is of variable length (16-19 amino acids) binding to the hydrophobic cleft formed by actin's subdomains 1 and 3. At the N-terminus it forms an alpha-helix followed by a flexible loop stabilised upon actin binding. [^R]..((.[ILMVF])|([ILMVF].))[^P][^P][ILVM].{4,7}L(([KR].)|(NK))[VATIGS] 13 http://elm.eu.org/elms/elmPages/LIG_Actin_WH2_2.html
LIG_AP2alpha_1 FxDxF motif responsible for the binding of accessory endocytic proteins to the appendage of the alpha-subunit of adaptor protein complex AP-2 F.D.F 11 http://elm.eu.org/elms/elmPages/LIG_AP2alpha_1.html
LIG_AP2alpha_2 DPF/W motif binds alpha and beta subunits of AP2 adaptor complex. DP[FW] 54 http://elm.eu.org/elms/elmPages/LIG_AP2alpha_2.html
LIG_APCC_Cbox_1 Motif in APC/C co-activators that mediates binding to the APC/C core, possibly the catalytic Apc2 subunit. This first variant defines the motif in APC/C co-activators from Eukaryotes except Fungi and Amoebozoa. [DE]R[YFH][ILFVM][PAG].R 2 http://elm.eu.org/elms/elmPages/LIG_APCC_Cbox_1.html
LIG_APCC_Cbox_2 Motif in APC/C co-activators that mediates binding to the APC/C core, possibly the catalytic Apc2 subunit. This second variant defines the motif in APC/C co-activators from Fungi and Amoebozoa. DR[YFH][ILFVM][PA].. 3 http://elm.eu.org/elms/elmPages/LIG_APCC_Cbox_2.html
LIG_AP_GAE_1 The acidic Phe motif mediates the interaction between a set of accessory proteins and the gamma-ear domain (GAE) of GGAs and AP-1. Proposed roles: in clathrin localization and assembly on TGN/endosome membranes and in traffic between the TGN and endosome. [DE][DES][DEGAS]F[SGAD][DEAP][LVIMFD] 11 http://elm.eu.org/elms/elmPages/LIG_AP_GAE_1.html
LIG_BIR_III_1 These IBMs are found in pro-apoptotic proteins and function in the abrogation of caspase inhibition by Inhibitor of Apoptosis Proteins (IAPs) in apoptotic cells. The motif binds specifically to type III BIR domains. ^M{0,1}A.P. 1 http://elm.eu.org/elms/elmPages/LIG_BIR_III_1.html
LIG_BIR_III_2 These IBMs are found at the N-terminal regions of caspase subunits where they mediate the inhibition of activated caspases by binding to conserved surface grooves on type III BIR domains of Inhibitor of Apoptosis Proteins (IAPs). DA.P. 3 http://elm.eu.org/elms/elmPages/LIG_BIR_III_2.html
LIG_BIR_III_3 These IBMs are found in arthropodal pro-apoptotic proteins and function in the abrogation of caspase inhibition by Inhibitor of Apoptosis Proteins (IAPs) in apoptotic cells. The motif binds specifically to type III BIR domains of arthropodal IAPs. ^M{0,1}A.[AP]. 4 http://elm.eu.org/elms/elmPages/LIG_BIR_III_3.html
LIG_BIR_III_4 These IBMs are found in the N-terminal regions of arthropodal caspase subunits where they mediate the inhibition of activated caspases by binding to conserved surface grooves on type III BIR domains of Inhibitor of Apoptosis Proteins (IAPs). DA.G. 2 http://elm.eu.org/elms/elmPages/LIG_BIR_III_4.html
LIG_BRCT_BRCA1_1 Phosphopeptide motif which directly interacts with the BRCT (carboxy-terminal) domain of the Breast Cancer Gene BRCA1 with low affinity .(S)..F 5 http://elm.eu.org/elms/elmPages/LIG_BRCT_BRCA1_1.html
LIG_BRCT_BRCA1_2 Phosphopeptide motif which directly interacts with the BRCT (carboxy-terminal) domain of the Breast Cancer Gene BRCA1 with high affinity. .(S)..F.K 1 http://elm.eu.org/elms/elmPages/LIG_BRCT_BRCA1_2.html
LIG_BRCT_MDC1_1 Phosphopeptide motif which is specifically recognized by the BRCT (Carboxy-terminal) repeats of MDC1 .(S)..Y$ http://elm.eu.org/elms/elmPages/LIG_BRCT_MDC1_1.html
LIG_CaMK_CASK_1 Motif that mediates binding to the calmodulin-dependent protein kinase (CaMK) domain of the peripheral plasma membrane protein CASK/Lin2. ((SP)|([ED].{0,1}))[IV]W[IVL].R 6 http://elm.eu.org/elms/elmPages/LIG_CaMK_CASK_1.html
LIG_CAP-Gly_1 Short, acidic and aromatic carboxy terminal sequence found in a small group of microtubule-associated-proteins. The EEY/F$ motif is highly conserved and so far limited to a few known proteins, alpha-tubulin, EB proteins and CLIP170. [ED].{0,2}[ED].{0,2}[EDQ].{0,1}[YF]$ 3 http://elm.eu.org/elms/elmPages/LIG_CAP-Gly_1.html
LIG_CAP-Gly_2 Short, partly aromatic carboxy terminal sequence found in the SLAIN group of microtubule-associated-proteins. .W[RK][DE]GCY$ 1 http://elm.eu.org/elms/elmPages/LIG_CAP-Gly_2.html
LIG_Clathr_ClatBox_1 Clathrin box motif found on cargo adaptor proteins, it interacts with the beta propeller structure located at the N-terminus of Clathrin heavy chain. L[IVLMF].[IVLMF][DE] 18 http://elm.eu.org/elms/elmPages/LIG_Clathr_ClatBox_1.html
LIG_Clathr_ClatBox_2 Clathrin box motif found on cargo adaptor proteins, it mediates binding to the N-terminal beta propeller of clathrin heavy chain. Also called W box, it is found in the central region of Amphiphysins where it coexists with a "classical" clathrin box. .[NP]W[DES].W 2 http://elm.eu.org/elms/elmPages/LIG_Clathr_ClatBox_2.html
LIG_CORNRBOX The corepressor nuclear receptor box motif confers binding to nuclear receptors. L[^P]{2,2}[HI]I[^P]{2,2}[IAV][IL] 4 http://elm.eu.org/elms/elmPages/LIG_CORNRBOX.html
LIG_CtBP_PxDLS_1 The PxDLS motif interacts with the NAD-dependent repressor CtBP proteins. (P[LVIPME][DENS][LM][VASTRG])|(G[LVIPME][DENS][LM][VASTRG]((K)|(.[KR]))) 32 http://elm.eu.org/elms/elmPages/LIG_CtBP_PxDLS_1.html
LIG_Dynein_DLC8_1 The [KR]xTQT motif interacts with the common target-accepting grooves of 8kDa Dynein Light Chain dimer. [^P].[KR].TQT 9 http://elm.eu.org/elms/elmPages/LIG_Dynein_DLC8_1.html
LIG_EABR_CEP55_1 This proline-rich motif binds to the EABR domain of Cep55 and is involved in both cytokinesis of somatic cells and intercellular bridge formation in differentiating germ cells. .A.GPP.{2,3}Y. 6 http://elm.eu.org/elms/elmPages/LIG_EABR_CEP55_1.html
LIG_EF_ALG2_ABM_1 This isoform-specific ALG-2-binding motif binds to the EF hand domains of the proapoptotic Ca2+-binding ALG-2 protein in a Ca2+-dependent manner. P[PG]{0,1}YP.{1,6}Y[QS]{0,1}P 9 http://elm.eu.org/elms/elmPages/LIG_EF_ALG2_ABM_1.html
LIG_EF_ALG2_ABM_2 This isoform-unspecific ALG-2-binding motif binds to the EF hand domains of the proapoptotic Ca2+-binding ALG-2 protein in a Ca2+-dependent manner. P.P.{0,1}GF 3 http://elm.eu.org/elms/elmPages/LIG_EF_ALG2_ABM_2.html
LIG_EH_1 NPF motif interacting with EH domains, usually during regulation of endocytotic processes .NPF. 88 http://elm.eu.org/elms/elmPages/LIG_EH_1.html
LIG_EH1_1 The engrailed homology domain 1 motif is found in homeodomain containing active repressors and other transcription families, and allows for the recruitment of Groucho/TLE corepressors. .[FYH].[IVM][^WFYP][^WFYP][ILM][ILMV]. 11 http://elm.eu.org/elms/elmPages/LIG_EH1_1.html
LIG_eIF4E_1 Motif binding to the dorsal surface of eIF4E. Y....L[VILMF] 13 http://elm.eu.org/elms/elmPages/LIG_eIF4E_1.html
LIG_eIF4E_2 Atypical variant of eIF4E motif. Y.PP.[ILMV]R 5 http://elm.eu.org/elms/elmPages/LIG_eIF4E_2.html
LIG_EVH1_1 Proline-rich motif binding to signal transduction class I EVH1 domains. ([FYWL]P.PP)|([FYWL]PP[ALIVTFY]P) 19 http://elm.eu.org/elms/elmPages/LIG_EVH1_1.html
LIG_EVH1_2 Proline-rich motif binding to signal transduction class II EVH1 domains. PP..F 8 http://elm.eu.org/elms/elmPages/LIG_EVH1_2.html
LIG_EVH1_3 A proline-rich motif binding to EVH1/WH1 domains of WASP and N-WASP proteins. [FY].[FW].....[LMVIF]P.P[DE] 3 http://elm.eu.org/elms/elmPages/LIG_EVH1_3.html
LIG_FAT_LD_1 The paxillin LD motif is recognized by FAK and other focal adhesion proteins mainly involved in cytoskeletal regulation [LV][DE][^P][LM][LM][^P][^P]L[^P] 4 http://elm.eu.org/elms/elmPages/LIG_FAT_LD_1.html
LIG_FHA_1 Phosphothreonine motif binding a subset of FHA domains that show a preference for a large aliphatic amino acid at the pT+3 position. ..(T)..[ILV]. 5 http://elm.eu.org/elms/elmPages/LIG_FHA_1.html
LIG_FHA_2 Phosphothreonine motif binding a subset of FHA domains that have a preference for an acidic amino acid at the pT+3 position. ..(T)..[DE]. 6 http://elm.eu.org/elms/elmPages/LIG_FHA_2.html
LIG_GLEBS_BUB3_1 Gle2-binding-sequence motif [EN][FYLW][NSQ].EE[ILMVF][^P][LIVMFA] 5 http://elm.eu.org/elms/elmPages/LIG_GLEBS_BUB3_1.html
LIG_GYF LIG_GYF is a proline-rich sequence specifically recognized by GYF domains [QHR].{0,1}P[PL]PP[GS]H[RH] 3 http://elm.eu.org/elms/elmPages/LIG_GYF.html
LIG_HCF-1_HBM_1 The DHxY Host Cell Factor-1 binding motif (HBM) interacts with the N-terminal kelch propeller domain of the cell cycle regulator HCF-1 [DE]H.Y 17 http://elm.eu.org/elms/elmPages/LIG_HCF-1_HBM_1.html
LIG_HOMEOBOX The YPWM motif confers binding to the PBX homeobox domain [FY][DEP]WM 16 http://elm.eu.org/elms/elmPages/LIG_HOMEOBOX.html
LIG_HP1_1 Ligand to interface formed by dimerisation of two chromoshadow domains in HP1 proteins. P[MVLIRWY]V[MVLIAS][LM] 9 http://elm.eu.org/elms/elmPages/LIG_HP1_1.html
LIG_Integrin_isoDGR_1 NGR motif is present in proteins of extracellular matrix which upon deamidation forms a biologically active isoDGR motif that binds to various members of integrin family. NGR 8 http://elm.eu.org/elms/elmPages/LIG_Integrin_isoDGR_1.html
LIG_IQ Calmodulin binding helical peptide motif ...[SACLIVTM]..[ILVMFCT]Q.{3,3}[RK].{4,5}[RKQ].. 40 http://elm.eu.org/elms/elmPages/LIG_IQ.html
LIG_KEPE_1 Short length variant of the KEPE motif which is found superposed on some SUMO sites [VILMFT]K.EP.[DE] 5 http://elm.eu.org/elms/elmPages/LIG_KEPE_1.html
LIG_KEPE_2 Medium length variant of the KEPE motif which is found superposed on some SUMO sites [VILMFT]K.EP.{2,3}[DE] 12 http://elm.eu.org/elms/elmPages/LIG_KEPE_2.html
LIG_KEPE_3 Long length variant of the KEPE motif which is found superposed on some SUMO sites [VILMFT]K.EP....[DE] 4 http://elm.eu.org/elms/elmPages/LIG_KEPE_3.html
LIG_LIR_Apic_2 Apicomplexa-specific variant of the canonical LIR motif that binds to Atg8 protein family members to mediate processes involved in autophagy. [EDST].{0,2}[WFY]..P 1 http://elm.eu.org/elms/elmPages/LIG_LIR_Apic_2.html
LIG_LIR_Gen_1 Canonical LIR motif that binds to Atg8 protein family members to mediate processes involved in autophagy. [EDST].{0,2}[WFY]..[ILV] 21 http://elm.eu.org/elms/elmPages/LIG_LIR_Gen_1.html
LIG_LIR_LC3C_4 Non-canonical variant of the LIR motif that binds to Atg8 protein family members to mediate processes involved in autophagy. [EDST].{0,2}LVV 1 http://elm.eu.org/elms/elmPages/LIG_LIR_LC3C_4.html
LIG_LIR_Nem_3 Nematode-specific variant of the canonical LIR motif that binds to Atg8 protein family members to mediate processes involved in autophagy. [EDST].{0,2}[WFY]..[ILVFY] http://elm.eu.org/elms/elmPages/LIG_LIR_Nem_3.html
LIG_LYPXL_L_2 The long version of the LYPxL motif binds the V-domain of Alix, a protein involved in endosomal sorting. [LM]YP...[LI][^P][^P][LI] 3 http://elm.eu.org/elms/elmPages/LIG_LYPXL_L_2.html
LIG_LYPXL_S_1 The short version of the LYPxL motif binds the V-domain of Alix, a protein involved in endosomal sorting. [LM]YP.[LI] 16 http://elm.eu.org/elms/elmPages/LIG_LYPXL_S_1.html
LIG_MAD2 Mad2 binding motif [KR][IV][LV].....P 6 http://elm.eu.org/elms/elmPages/LIG_MAD2.html
LIG_MYND_1 PxLxP motif is recognized by a subset of MYND domain containing proteins. P.L.P http://elm.eu.org/elms/elmPages/LIG_MYND_1.html
LIG_MYND_2 Motif that mediates the interaction between MYND domain of AML1/ETO and co-repressors SMRT and N-CoR. PP.LI 3 http://elm.eu.org/elms/elmPages/LIG_MYND_2.html
LIG_MYND_3 A variant MYND binding motif found in the HSP90 co-chaperones p23 and FKBP38 interacting with PHD2 MYND domain. [LMV]P.LE 2 http://elm.eu.org/elms/elmPages/LIG_MYND_3.html
LIG_NBox_RRM_1 Amino terminal region on Far Upstream Element (FUSE) binding protein (Q96AE4), which mediates the interaction with FIR in order to recruit FIR (Q9UHX1) to FUSE DNA. F..A[ILV]..A..[ILV] http://elm.eu.org/elms/elmPages/LIG_NBox_RRM_1.html
LIG_NRBOX The nuclear receptor box motif (LXXLL) confers binding to nuclear receptors. [^P]L[^P][^P]LL[^P] 24 http://elm.eu.org/elms/elmPages/LIG_NRBOX.html
LIG_OCRL_FandH_1 The F and H motif describes a 10-13-mer peptide sequence determined by a highly conserved phenylalanine and histidine residue surrounded by hydrophobic amino acids. A complex of ASH and RhoGAP-like domain binds this motif within a hydrophobic pocket. .F[^P][^P][KRIL]H[^P][^P][YLMFH][^P]... 3 http://elm.eu.org/elms/elmPages/LIG_OCRL_FandH_1.html
LIG_PAM2_1 Peptide ligand motif that directly binds to the MLLE/PABC domain found in poly(A)-binding proteins and HYD E3 ubiquitin ligases, mainly via a common central core region and a complementary N-terminal region. ..[LFP][NS][PIVTAFL].A..(([FY].[PYLF])|(W..)). 22 http://elm.eu.org/elms/elmPages/LIG_PAM2_1.html
LIG_PAM2_2 Peptide ligand motif that directly binds to the MLLE/PABC domain found in poly(A)-binding proteins and HYD E3 ubiquitin ligases, mainly via a common central core region and a complementary C-terminal region. ((WPP)|([FL][PV][APQ]))EF.PG.PWKG. 4 http://elm.eu.org/elms/elmPages/LIG_PAM2_2.html
LIG_PCNA_PIPBox_1 The PCNA binding PIP box motif is found in proteins involved in DNA replication, repair and cell cycle control. ((^.{0,3})|(Q)).[^FHWY][ILM][^P][^FHILVWYP][HFM][FMY].. 18 http://elm.eu.org/elms/elmPages/LIG_PCNA_PIPBox_1.html
LIG_PDZ_Class_1 The C-terminal class 1 PDZ-binding motif is classically represented by a pattern like (ST)X(VIL)* ...[ST].[ACVILF]$ 48 http://elm.eu.org/elms/elmPages/LIG_PDZ_Class_1.html
LIG_PDZ_Class_2 The C-terminal class 2 PDZ-binding motif is classically represented by a pattern such as (VYF)X(VIL)* ...[VLIFY].[ACVILF]$ 13 http://elm.eu.org/elms/elmPages/LIG_PDZ_Class_2.html
LIG_PDZ_Class_3 The C-terminal class 3 PDZ-binding motif is classically represented by a pattern such as (DE)X(VIL)* ...[DE].[ACVILF]$ 1 http://elm.eu.org/elms/elmPages/LIG_PDZ_Class_3.html
LIG_PTAP_UEV_1 PTAP motif binds the N-terminal UEV domain of Tsg101. .P[TS]AP. 25 http://elm.eu.org/elms/elmPages/LIG_PTAP_UEV_1.html
LIG_PTB_Apo_2 These phosphorylation-independent motifs bind to Dab-like PTB domains. Binding is not driven by contacts at the 0 or FY position, but instead is dependent upon the large number of hydrophobic and hydrogen bond contacts between motif and domain. (.[^P].NP.[FY].)|(.[ILVMFY].N..[FY].) 19 http://elm.eu.org/elms/elmPages/LIG_PTB_Apo_2.html
LIG_PTB_Phospho_1 This phosphorylation-dependent motif binds to Shc-like and IRS-like PTB domains. The pTyr is positioned within a highly basic-charged anchoring pocket. A hydrophobic residue -5 (compared to pY) increases the affinity of the interaction. (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)) 17 http://elm.eu.org/elms/elmPages/LIG_PTB_Phospho_1.html
LIG_Rb_LxCxE_1 Interacts with the Retinoblastoma protein [LI].C.[DE] 32 http://elm.eu.org/elms/elmPages/LIG_Rb_LxCxE_1.html
LIG_Rb_pABgroove_1 The LxxLFD motif binds in a deep groove between pocket A and pocket B of the Retinoblastoma protein ..[LIMV]..[LM][FY]D. 3 http://elm.eu.org/elms/elmPages/LIG_Rb_pABgroove_1.html
LIG_RGD The RGD motif can be found in many proteins of the extracellular matrix and it is recognized by different members of the integrin family. The structure of the tenth type III module of fibronectin has shown that the RGD motif lies on an exposed flexible lo RGD 21 http://elm.eu.org/elms/elmPages/LIG_RGD.html
LIG_RRM_PRI_1 The PTB RRM2 Interacting (PRI) motif is found in some splicing regulators, possibly only in the chordate lineage. As part of splicing complex regulation, it interacts with the 2nd RNA binding domain (RRM) of PTB, the polypyrimidine tract binding protein. .[ILVM]LG..P. 3 http://elm.eu.org/elms/elmPages/LIG_RRM_PRI_1.html
LIG_SH2_GRB2 GRB2-like Src Homology 2 (SH2) domains binding motif. (Y).N. 16 http://elm.eu.org/elms/elmPages/LIG_SH2_GRB2.html
LIG_SH2_PTP2 SH-PTP2 and phospholipase C-gamma Src Homology 2 (SH2) domains binding motif. (Y)[IV].[VILP] 1 http://elm.eu.org/elms/elmPages/LIG_SH2_PTP2.html
LIG_SH2_SRC Src-family Src Homology 2 (SH2) domains binding motif. (Y)[QDEVAIL][DENPYHI][IPVGAHS] 23 http://elm.eu.org/elms/elmPages/LIG_SH2_SRC.html
LIG_SH2_STAT3 YXXQ motif found in the cytoplasmic region of cytokine receptors that bind STAT3 SH2 domain. (Y)..Q 9 http://elm.eu.org/elms/elmPages/LIG_SH2_STAT3.html
LIG_SH2_STAT5 STAT5 Src Homology 2 (SH2) domain binding motif. (Y)[VLTFIC].. 19 http://elm.eu.org/elms/elmPages/LIG_SH2_STAT5.html
LIG_SH2_STAT6 STAT6 Src Homology 2 (SH2) domain binding motif. G(Y)[KQ].F 1 http://elm.eu.org/elms/elmPages/LIG_SH2_STAT6.html
LIG_SH3_1 This is the motif recognized by class I SH3 domains [RKY]..P..P 5 http://elm.eu.org/elms/elmPages/LIG_SH3_1.html
LIG_SH3_2 This is the motif recognized by class II SH3 domains P..P.[KR] 19 http://elm.eu.org/elms/elmPages/LIG_SH3_2.html
LIG_SH3_3 This is the motif recognized by those SH3 domains with a non-canonical class I recognition specificity ...[PV]..P 16 http://elm.eu.org/elms/elmPages/LIG_SH3_3.html
LIG_SH3_4 This is the motif recognized by those SH3 domains with a non-canonical class II recognition specificity KP..[QK]... 2 http://elm.eu.org/elms/elmPages/LIG_SH3_4.html
LIG_SH3_5 PXXDY motif recognized by some SH3 domains P..DY 3 http://elm.eu.org/elms/elmPages/LIG_SH3_5.html
LIG_Sin3_1 Motif interacts with PAH2 domain in the Sin3 scaffold protein. [LIV]..[LM]L.AA.[FY][LI] 4 http://elm.eu.org/elms/elmPages/LIG_Sin3_1.html
LIG_Sin3_2 Motif interacts with PAH2 domain in the Sin3 scaffold protein (sp-1 like). [FHYM].A[AV].[VAC]L[MV].[MI] 3 http://elm.eu.org/elms/elmPages/LIG_Sin3_2.html
LIG_Sin3_3 Motif interacts with PAH2 domain in the Sin3 scaffold protein (not mad or sp-1 like). [FA].[LA][LV][LVI]..[AM] 2 http://elm.eu.org/elms/elmPages/LIG_Sin3_3.html
LIG_SPRY_1 Peptide motif binding to the members of the SSB (or SPSB) family (SPRY domain- and SOCS box-containing protein) [ED][LIV]NNN[^P] http://elm.eu.org/elms/elmPages/LIG_SPRY_1.html
LIG_SUFU_1 A hydrophobic motif in GLI transcription factors required for binding to SUFU protein, which inhibits their activity and hence negatively regulates hedgehog signalling. [SV][CY]GH[LIF][LAST][GAIV]. 5 http://elm.eu.org/elms/elmPages/LIG_SUFU_1.html
LIG_SUMO_SBM_1 Motif that mediates binding to SUMO proteins non-covalently. [ILV](.[ILV]|[ILV]|[ILV].)[ILV][STDE]{1,10} 39 http://elm.eu.org/elms/elmPages/LIG_SUMO_SBM_1.html
LIG_SUMO_SBM_2 Inverted version of LIG_SUMO_SBM_1 that mediates binding to SUMO proteins non-covalently. [STDE]{1,10}[ILV](.[ILV]|[ILV]|[ILV].)[ILV] 8 http://elm.eu.org/elms/elmPages/LIG_SUMO_SBM_2.html
LIG_SxIP_EBH_1 SxIP motifs bind to EBH domains. ([KR][^ED]{0,5}[ST].IP[^ED]{5,5})|([^ED]{5,5}[ST].IP[^ED]{0,5}[KR]) 9 http://elm.eu.org/elms/elmPages/LIG_SxIP_EBH_1.html
LIG_TPR Ligands of the TPR (tetratricopeptide repeat motif) domains are EEVD motifs, C-terminal sequences highly conserved in all eukaryotic members of the Hsp70 and Hsp90 families. EEVD$ http://elm.eu.org/elms/elmPages/LIG_TPR.html
LIG_TRAF2_1 Major TRAF2-binding consensus motif. Members of the tumor necrosis factor receptor (TNFR) superfamily initiate intracellular signaling by recruiting the C-domain of the TNFR-associated factors (TRAFs) through their cytoplasmic tails. [PSAT].[QE]E 14 http://elm.eu.org/elms/elmPages/LIG_TRAF2_1.html
LIG_TRAF2_2 Minor TRAF2-binding consensus motif. Members of the tumor necrosis factor receptor (TNFR) superfamily initiate intracellular signaling by recruiting the C-domain of the TNFR-associated factors (TRAFs) through their cytoplasmic tails. P.Q..D 1 http://elm.eu.org/elms/elmPages/LIG_TRAF2_2.html
LIG_TRAF6 TRAF6 binding site. Members of the tumor necrosis factor receptor (TNFR) superfamily initiate intracellular signaling by recruiting the C-domain of the TNFR-associated factors (TRAFs) through their cytoplasmatic tails. ..P.E..[FYWHDE]. 20 http://elm.eu.org/elms/elmPages/LIG_TRAF6.html
LIG_TRFH_1 TRF1 and TRF2 both bind to another shelterin protein: TIN2. The TRF1-TIN2 interaction was mediated by a short motif in the N-Ter of TIN2. TIN2 connects TRF1 to TRF2; this link contributes to the stabilization of TRF2 on telomeres. [FY].L.P 3 http://elm.eu.org/elms/elmPages/LIG_TRFH_1.html
LIG_TYR_ITAM ITAM (immunoreceptor tyrosine-based activatory motif). ITAM consists of partially conserved short sequence of amino acid found in the cytoplasmatic tail of antigen and Fc receptors. [DEN]..(Y)..[LI].{6,12}(Y)..[LI] 7 http://elm.eu.org/elms/elmPages/LIG_TYR_ITAM.html
LIG_TYR_ITIM ITIM (immunoreceptor tyrosine-based inhibitory motif). Phosphorylation of the ITIM motif, found in the cytoplasmic tail of some inhibitory receptors (KIRs) that bind MHC Class I, leads to the recruitment and activation of a protein tyrosine phosphatase. [ILV].(Y)..[ILV] http://elm.eu.org/elms/elmPages/LIG_TYR_ITIM.html
LIG_TYR_ITSM ITSM (immunoreceptor tyrosine-based switch motif). This motif is present in the cytoplasmic region of the CD150 subfamily within the CD2 family and it enables these receptors to bind to and to be regulated by SH2 adaptor molecules, as SH2DIA. ..T.(Y)..[IV] 12 http://elm.eu.org/elms/elmPages/LIG_TYR_ITSM.html
LIG_ULM_U2AF65_1 Pattern encompassing the ULMs in SF1 and SAP155 which bind to the UHM of U2AF65 [KR]{1,4}[KR].[KR]W. 5 http://elm.eu.org/elms/elmPages/LIG_ULM_U2AF65_1.html
LIG_WD40_WDR5_1 This WDR5-binding motif binds between blades 5 and 6 of the WD40 repeat domain of WDR5, opposite of the Win motif-binding site, to mediate assembly of histone methylation complexes. [ED].{0,3}[VI]D[VI] 1 http://elm.eu.org/elms/elmPages/LIG_WD40_WDR5_1.html
LIG_WD40_WDR5_2 Fungi-specific variant of the WDR5-binding motif that recruits RbBP5 to a cleft between blades 5 and 6 of the WD40 repeat domain of WDR5, opposite of the Win motif-binding site, to mediate assembly of histone methylation complexes. [EDSTY].{0,4}[VIPLA][TSDEKR][ILVA] 2 http://elm.eu.org/elms/elmPages/LIG_WD40_WDR5_2.html
LIG_WD40_WDR5_WIN_1 Known as the Win (WDR5 interaction) motif, this peptide binds to the central tunnel of the WD40 repeat domain of WDR5 to mediate assembly of histone methylation complexes. [HN].[HNST]G[SCA]AR[STAC][EQ][GPVILM][YFHKRQN][YHLIVMATS] 6 http://elm.eu.org/elms/elmPages/LIG_WD40_WDR5_WIN_1.html
LIG_WD40_WDR5_WIN_2 Generalised metazoan variant of the Win (WDR5 interaction) motif, which in Vertebrates binds to the central tunnel of the WD40 repeat domain of WDR5 to mediate assembly of histone methylation complexes. [HNCSVI]..[GDE][STCA][AGVS]R[STCA][EQR][GPLAV] 3 http://elm.eu.org/elms/elmPages/LIG_WD40_WDR5_WIN_2.html
LIG_WD40_WDR5_WIN_3 Generalised fungal variant of the Win (WDR5 interaction) motif, which in Vertebrates binds to the central tunnel of the WD40 repeat domain of WDR5 to mediate assembly of histone methylation complexes. [HNSTE].[TSQN]P{0,1}GS{0,1}[SCA][AFWH][KR][TAS][DEQ][GP][RKYFIVAMW]..[IVM] http://elm.eu.org/elms/elmPages/LIG_WD40_WDR5_WIN_3.html
LIG_WH1 LIG_WH1 is the WIP sequence motif binding to the WH1 domains of WASP and N-WASP. ES[RK][FY].F[HR][PST][IVLM][DES][DE] http://elm.eu.org/elms/elmPages/LIG_WH1.html
LIG_WRPW_1 The WRPW motif mediates recruitment of transcriptional co-repressors of the Groucho/transducin-like enhancer-of-split (TLE) family. LIG_WRPW_1 is based on the C-terminus located motifs found in the Hairy and Runt family proteins. [WFY]RP[WFY].{0,7}$ 95 http://elm.eu.org/elms/elmPages/LIG_WRPW_1.html
LIG_WRPW_2 The WRPW motif mediates recruitment of transcriptional co-repressors of the Groucho/transducin-like enhancer-of-split (TLE) family. LIG_WRPW_2 is not restricted to the C-terminus (in contrast to LIG_WRPW_1). [WFY][KR]P[WFY] 2 http://elm.eu.org/elms/elmPages/LIG_WRPW_2.html
LIG_WW_1 PPXY is the motif recognized by WW domains of Group I PP.Y 28 http://elm.eu.org/elms/elmPages/LIG_WW_1.html
LIG_WW_2 PPLP is the motif recognized by WW domains of Group II PPLP 3 http://elm.eu.org/elms/elmPages/LIG_WW_2.html
LIG_WW_3 WW domain of group III binding motif .PPR. 1 http://elm.eu.org/elms/elmPages/LIG_WW_3.html
MOD_ASX_betaOH_EGF ASX hydroxylation of some EGF domains. C.([DN]).{4,4}[FY].C.C 6 http://elm.eu.org/elms/elmPages/MOD_ASX_betaOH_EGF.html
MOD_CAAXbox Generic CAAX box prenylation motif (C)[^DENQ][LIVM].$ 2 http://elm.eu.org/elms/elmPages/MOD_CAAXbox.html
MOD_CDK_1 Substrate motif for phosphorylation by CDK ...([ST])P.[KR] 11 http://elm.eu.org/elms/elmPages/MOD_CDK_1.html
MOD_CK1_1 CK1 phosphorylation site S..([ST])... 2 http://elm.eu.org/elms/elmPages/MOD_CK1_1.html
MOD_CK2_1 CK2 phosphorylation site ...([ST])..E 10 http://elm.eu.org/elms/elmPages/MOD_CK2_1.html
MOD_CMANNOS Motif for attachment of a mannosyl residue to a tryptophan (W)..W 24 http://elm.eu.org/elms/elmPages/MOD_CMANNOS.html
MOD_GlcNHglycan Glycosaminoglycan attachment site [ED]{0,3}.(S)[GA]. 6 http://elm.eu.org/elms/elmPages/MOD_GlcNHglycan.html
MOD_GSK3_1 GSK3 phosphorylation recognition site ...([ST])...[ST] 22 http://elm.eu.org/elms/elmPages/MOD_GSK3_1.html
MOD_LATS_1 The LATS phosphorylation motif is recognised by the LATS kinases for Ser/Thr phosphorylation. Substrates are often found toward the end of the Hippo signalling pathway. H.[KR]..([ST])[^P] 23 http://elm.eu.org/elms/elmPages/MOD_LATS_1.html
MOD_NEK2_1 NEK2 phosphorylation motif with preferred Phe, Leu or Met in the -3 position to compensate for less favorable residues in the +1 and +2 position. [FLM][^P][^P]([ST])[^DEP][^DE] 3 http://elm.eu.org/elms/elmPages/MOD_NEK2_1.html
MOD_N-GLC_1 Generic motif for N-glycosylation. It was shown that Trp, Asp, and Glu are uncommon before the Ser/Thr position (Shakin-Eshleman, 1996; http://www.ncbi.nlm.nih.gov/pubmed/8626433). Efficient glycosylation usually occurs when ~60 residues or more separate the glycosylation acceptor site from the C-terminus. .(N)[^P][ST].. 156 http://elm.eu.org/elms/elmPages/MOD_N-GLC_1.html
MOD_N-GLC_2 Atipical motif for N-glycosylation site. Examples are Human CD69, which is uniquely glycosylated at typical (Asn-X-Ser/Thr) and atypical (Asn-X-Cys) motifs, beta protein C (N)[^P]C 5 http://elm.eu.org/elms/elmPages/MOD_N-GLC_2.html
MOD_NMyristoyl Generic motif for N-Myristoylation site. ^M{0,1}(G)[^EDRKHPFYW]..[STAGCN][^P] 48 http://elm.eu.org/elms/elmPages/MOD_NMyristoyl.html
MOD_OFUCOSY Site for attachment of a fucose residue to a serine. C.{3,5}([ST])C 4 http://elm.eu.org/elms/elmPages/MOD_OFUCOSY.html
MOD_OGLYCOS Site for attachment of a glucose residue to a serine. C.(S).PC 2 http://elm.eu.org/elms/elmPages/MOD_OGLYCOS.html
MOD_PIKK_1 (ST)Q motif which is phosphorylated by PIKK family members. ...([ST])Q.. 30 http://elm.eu.org/elms/elmPages/MOD_PIKK_1.html
MOD_PK_1 Phosphorylase kinase phosphorylation site [RK]..(S)[VI].. 1 http://elm.eu.org/elms/elmPages/MOD_PK_1.html
MOD_PKA_1 Main preference for PKA-type AGC kinase phosphorylation. [RK][RK].([ST])[^P].. 25 http://elm.eu.org/elms/elmPages/MOD_PKA_1.html
MOD_PKA_2 Secondary preference for PKA-type AGC kinase phosphorylation. .R.([ST])[^P].. 28 http://elm.eu.org/elms/elmPages/MOD_PKA_2.html
MOD_PKB_1 PKB Phosphorylation site R.R..([ST])[^P].. 20 http://elm.eu.org/elms/elmPages/MOD_PKB_1.html
MOD_PLK Site phosphorylated by the Polo-like kinase. .[DE].([ST])[ILFWMVA].. 2 http://elm.eu.org/elms/elmPages/MOD_PLK.html
MOD_ProDKin_1 Proline-Directed Kinase (e.g. MAPK) phosphorylation site in higher eukaryotes. ...([ST])P.. 36 http://elm.eu.org/elms/elmPages/MOD_ProDKin_1.html
MOD_SPalmitoyl_2 Class 2 Palmitoylation motif G(C)M[GS][CL][KP]C 2 http://elm.eu.org/elms/elmPages/MOD_SPalmitoyl_2.html
MOD_SPalmitoyl_4 Class 4 palmitoylation motif ^M{0,1}G(C)..S[AKS] 6 http://elm.eu.org/elms/elmPages/MOD_SPalmitoyl_4.html
MOD_SUMO Motif recognised for modification by SUMO-1 [VILMAFP](K).E 45 http://elm.eu.org/elms/elmPages/MOD_SUMO.html
MOD_TYR_CSK Members of the non-receptor tyrosine kinase Csk family phosphorylate the C-terminal tyrosine residues of the Src family. [TAD][EA].Q(Y)[QE].[GQA][PEDLS] 12 http://elm.eu.org/elms/elmPages/MOD_TYR_CSK.html
MOD_TYR_DYR The kinase activity of the DYRK (dual specificity kinase) is dependent on the autophosphorylation of the YXY motif in the activation loop. ..[RKTC][IVL]Y[TQHS](Y)[IL]QSR 9 http://elm.eu.org/elms/elmPages/MOD_TYR_DYR.html
MOD_WntLipid Palmitoylation site in WNT signalling proteins that is required for correct processing in the endoplasmic reticulum. [ETA](C)[QERK]..F...RWNC[ST] 1 http://elm.eu.org/elms/elmPages/MOD_WntLipid.html
TRG_AP2beta_CARGO_1 AP-2 beta appendage platform subdomain (top surface) binding motif used in targeting cargo for internalisation. [DE].{1,2}F[^P][^P][FL][^P][^P][^P]R 4 http://elm.eu.org/elms/elmPages/TRG_AP2beta_CARGO_1.html
TRG_Cilium_Arf4_1 The VxPx motif is located in the cytoplasmatic tails of vesicular cargoes. It allows the interaction with proteins that permit the vesicle budding from the trans-Golgi-network and its posterior transport to the plasma membrane of the cilia. QV.P.$ 1 http://elm.eu.org/elms/elmPages/TRG_Cilium_Arf4_1.html
TRG_Cilium_RVxP_2 The VxPx motif is located in the cytoplasmatic tails of vesicular cargoes. It allows the interaction with proteins that permit the vesicle budding from the trans-Golgi-network and its posterior transport to the plasma membrane of the cilia RV.P. 2 http://elm.eu.org/elms/elmPages/TRG_Cilium_RVxP_2.html
TRG_ENDOCYTIC_2 Tyrosine-based sorting signal responsible for the interaction with mu subunit of AP (Adaptor Protein) complex Y..[LMVIF] 15 http://elm.eu.org/elms/elmPages/TRG_ENDOCYTIC_2.html
TRG_ER_diArg_1 The di-Arg ER retention motif is defined by two consecutive arginine residues (RR) or with a single residue insertion (RXR). The motif is completed by an adjacent hydrophobic/arginine residue which may be on either side of the Arg pair. ([LIVMFYWPR]R[^YFWDE]{0,1}R)|(R[^YFWDE]{0,1}R[LIVMFYWPR]) 27 http://elm.eu.org/elms/elmPages/TRG_ER_diArg_1.html
TRG_ER_diLys_1 ER retention and retrieving signal found at the C-terminus of type I ER membrane proteins (cytoplasmic in this topology). Di-Lysine signal is responsible for COPI-mediated retrieval from post-ER compartments. K.{0,1}K.{2,3}$ 14 http://elm.eu.org/elms/elmPages/TRG_ER_diLys_1.html
TRG_ER_FFAT_1 VAP-A/Scs2 MSP-domain binding FFAT (diphenylalanine [FF] in an Acidic Tract) motif [DE].{0,4}E[FY][FYK]D[AC].[ESTD] 20 http://elm.eu.org/elms/elmPages/TRG_ER_FFAT_1.html
TRG_ER_KDEL_1 Golgi-to-ER retrieving signal found at the C-terminus of many ER soluble proteins. It interacts with the KDEL receptor which in turns interacts with components of the coatomer (COP I). [KRHQSAP][DENQT]EL$ 12 http://elm.eu.org/elms/elmPages/TRG_ER_KDEL_1.html
TRG_Golgi_diPhe_1 ER to Golgi anterograde transport signal found at the C-terminus of type I ER-CGN integral membrane cargo receptors (cytoplasmic in this topology), it binds to COPII. Q.{6,6}FF.{6,7}$ 11 http://elm.eu.org/elms/elmPages/TRG_Golgi_diPhe_1.html
TRG_LysEnd_APsAcLL_1 Sorting and internalisation signal found in the cytoplasmic juxta-membrane region of type I transmembrane proteins. Targets them from the Trans Golgi Network to the lysosomal-endosomal-melanosomal compartments. Interacts with adaptor protein (AP) complexes [DERQ]...L[LVI] 16 http://elm.eu.org/elms/elmPages/TRG_LysEnd_APsAcLL_1.html
TRG_LysEnd_APsAcLL_3 Sorting signal found in the cytoplasmic juxta-membrane region of type I transmembrane lysosomal, endosomal and melanosomal proteins. Based on experimental evidence and alignments, this very specific ELM represents the best combination for AP3 binding. [DET]E[RK].PL[LI] 3 http://elm.eu.org/elms/elmPages/TRG_LysEnd_APsAcLL_3.html
TRG_LysEnd_GGAAcLL_1 Sorting signal directing type I transmembrane proteins from the Trans Golgi Network (TGN) to the lysosomal-endosomal compartment. It is found near the C-terminus and interacts with the VHS domain of GGAs adaptor proteins. D..LL.{1,2}$ 6 http://elm.eu.org/elms/elmPages/TRG_LysEnd_GGAAcLL_1.html
TRG_LysEnd_GGAAcLL_2 Internal acidic di Leucine motif found in GGA 1 and 3. It binds to their VHS domains in an autoinhibitory manner. Cycles of phosphorylation-dephosphorylation of upstream Ser regulate the autoinhibitory binding and therefore the function of GGA 1/3. S[LW]LD[DE]EL[LM] 4 http://elm.eu.org/elms/elmPages/TRG_LysEnd_GGAAcLL_2.html
TRG_NES_CRM1_1 Some proteins re-exported from the nucleus contain a Leucine-rich nuclear export signal (NES) binding to the CRM1 exportin protein. ([DEQ].{0,1}[LIM].{2,3}[LIVMF][^P]{2,3}[LMVF].[LMIV].{0,3}[DE])|([DE].{0,1}[LIM].{2,3}[LIVMF][^P]{2,3}[LMVF].[LMIV].{0,3}[DEQ]) 18 http://elm.eu.org/elms/elmPages/TRG_NES_CRM1_1.html
TRG_NLS_Bipartite_1 Bipartite variant of the classical basically charged NLS. [KR][KR].{7,15}[^DE]((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE] 9 http://elm.eu.org/elms/elmPages/TRG_NLS_Bipartite_1.html
TRG_NLS_MonoCore_2 Monopartite variant of the classical basically charged NLS. Strong core version. [^DE]((K[RK])|(RK))[KRP][KR][^DE] 17 http://elm.eu.org/elms/elmPages/TRG_NLS_MonoCore_2.html
TRG_NLS_MonoExtC_3 Monopartite variant of the classical basically charged NLS. C-extended version. [^DE]((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))(([PKR])|([^DE][DE])) 18 http://elm.eu.org/elms/elmPages/TRG_NLS_MonoExtC_3.html
TRG_NLS_MonoExtN_4 Monopartite variant of the classical basically charged NLS. N-extended version. (([PKR].{0,1}[^DE])|([PKR]))((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE] 26 http://elm.eu.org/elms/elmPages/TRG_NLS_MonoExtN_4.html
TRG_PEX_1 Wxxx[FY] motifs present in N-terminal half of Pex5 bind to Pex13 and Pex14 at peroxisomal and glycosomal membranes to facilitate entrance of PTS1 cargo proteins into the organellar lumen. W...[FY] 27 http://elm.eu.org/elms/elmPages/TRG_PEX_1.html
TRG_PEX_2 Fxxx[WF] motifs are present in Pex19 and S. cerevisiae Pex5 cytosolic receptors that bind to peroxisomal membrane docking member, Pex14 F...[WF] 2 http://elm.eu.org/elms/elmPages/TRG_PEX_2.html
TRG_PEX_3 LxxLLxxxLxxF motif is located in N-terminus of Pex19 receptors that are responsible for docking to Pex3 docking factor at cis side of peroxisomal membrane. L..LL...L..F 1 http://elm.eu.org/elms/elmPages/TRG_PEX_3.html
TRG_PTS1 Generic PTS1 ELM for all eukaryotes (.[SAPTC][KRH][LMFI]$)|([KRH][SAPTC][NTS][LMFI]$) 5 http://elm.eu.org/elms/elmPages/TRG_PTS1.html
TRG_PTS2 Generic PTS2 pattern for all eukaryotes (except lineages which have lost it) ^.{1,40}R[^P][^P][^P][LIV][^P][^P][HQ][LIF] 2 http://elm.eu.org/elms/elmPages/TRG_PTS2.html
CLV_PCSK_KEX2_1
CLV_PCSK_PC1ET2_1
CLV_PCSK_PC7_1
CLV_PCSK_SKI1_1
DEG_CRL4_CDT2_2
DEG_SCF_FBW7_2
DOC_AGCK_PIF_2
DOC_AGCK_PIF_3
LIG_14-3-3_2
LIG_14-3-3_3
LIG_Actin_WH2_1
LIG_APCC_Cbox_1
LIG_BIR_III_1
LIG_BIR_III_2
LIG_BIR_III_4
LIG_BRCT_BRCA1_2
LIG_BRCT_MDC1_1
LIG_EF_ALG2_ABM_2
LIG_eIF4E_2
LIG_KEPE_1
LIG_KEPE_3
LIG_LIR_Apic_2
LIG_LIR_LC3C_4
LIG_LIR_Nem_3
LIG_LYPXL_L_2
LIG_PTB_Phospho_1
LIG_WD40_WDR5_2
LIG_WD40_WDR5_WIN_2
LIG_WD40_WDR5_WIN_3
LIG_WRPW_2
MOD_PKA_2
TRG_Cilium_Arf4_1
TRG_LysEnd_APsAcLL_3
TRG_NLS_Bipartite_1
TRG_NLS_MonoExtC_3
TRG_NLS_MonoExtN_4
#Compiler and Linker
CC := g++
#The Target Binary Program
TARGET := main
#The Directories, Source, Includes, Objects, Binary and Resources
SRCDIR := src
INCDIR := inc
BUILDDIR := obj
TARGETDIR := bin
RESDIR := res
SRCEXT := cpp
DEPEXT := d
OBJEXT := o
#Flags, Libraries and Includes
CFLAGS := -fopenmp -Wall -O3 -g
LIB := -fopenmp -lm
INC := -I$(INCDIR) -I/usr/local/include
INCDEP := -I$(INCDIR)
#---------------------------------------------------------------------------------
#DO NOT EDIT BELOW THIS LINE
#---------------------------------------------------------------------------------
SOURCES := $(shell find $(SRCDIR) -type f -name '*.$(SRCEXT)')
OBJECTS := $(patsubst $(SRCDIR)/%,$(BUILDDIR)/%,$(SOURCES:.$(SRCEXT)=.$(OBJEXT)))
#Default Make
all: directories $(TARGET)
#Remake
remake: cleaner all
#Make the Directories
directories:
@mkdir -p $(TARGETDIR)
@mkdir -p $(BUILDDIR)
#Clean only Objects
clean:
@$(RM) -rf $(BUILDDIR)
#Full Clean, Objects and Binaries
cleaner: clean
@$(RM) -rf $(TARGETDIR)
#Pull in dependency info for *existing* .o files
-include $(OBJECTS:.$(OBJEXT)=.$(DEPEXT))
#Link
$(TARGET): $(OBJECTS)
$(CC) -o $(TARGETDIR)/$(TARGET) $^ $(LIB)
#Compile
$(BUILDDIR)/%.$(OBJEXT): $(SRCDIR)/%.$(SRCEXT)
@mkdir -p $(dir $@)
$(CC) $(CFLAGS) $(INC) -c -o $@ $<
@$(CC) $(CFLAGS) $(INCDEP) -MM $(SRCDIR)/$*.$(SRCEXT) > $(BUILDDIR)/$*.$(DEPEXT)
@cp -f $(BUILDDIR)/$*.$(DEPEXT) $(BUILDDIR)/$*.$(DEPEXT).tmp
@sed -e 's|.*:|$(BUILDDIR)/$*.$(OBJEXT):|' < $(BUILDDIR)/$*.$(DEPEXT).tmp > $(BUILDDIR)/$*.$(DEPEXT)
@sed -e 's/.*://' -e 's/\\$$//' < $(BUILDDIR)/$*.$(DEPEXT).tmp | fmt -1 | sed -e 's/^ *//' -e 's/$$/:/' >> $(BUILDDIR)/$*.$(DEPEXT)
@rm -f $(BUILDDIR)/$*.$(DEPEXT).tmp
#Non-File Targets
.PHONY: all remake directories clean cleaner resources
#ifndef DATABASE_HPP
#define DATABASE_HPP
#include "Header.hpp"
#include "Motif_class.hpp"
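
// ---------------------------------------------------------------------------
// Illustrative sketch only; not part of the original elm_processing sources.
// It shows one possible reading of the two simplification rules documented on
// Database::proc below: a repetition "{a,b}" keeps the lower bound a, and an
// alternation "|" keeps the shortest alternative.
// Assumptions: the namespace and function names are hypothetical; "shortest"
// is taken to mean fewest regex characters (the first alternative wins on a
// tie); only a '|' outside (...) and [...] is handled.
// ---------------------------------------------------------------------------
#include <cstddef>
#include <string>
#include <vector>

namespace elm_sketch
{
// Rewrite every "{a,b}" as "{a}", e.g. "K.{0,1}K.{2,3}" becomes "K.{0}K.{2}".
// Braces without a comma, such as "{6}", are copied unchanged.
inline std::string keep_min_repeat(const std::string& regex)
{
    std::string out;
    std::size_t i = 0;
    while (i < regex.size()) {
        if (regex[i] == '{') {
            const std::size_t comma = regex.find(',', i);
            const std::size_t close = regex.find('}', i);
            if (comma != std::string::npos && close != std::string::npos && comma < close) {
                out += regex.substr(i, comma - i); // copies "{a"
                out += '}';
                i = close + 1;                     // skips ",b}"
                continue;
            }
        }
        out += regex[i++];
    }
    return out;
}

// Split on '|' outside (...) and [...], then return the shortest alternative.
// A regex without a top-level '|' is returned unchanged.
inline std::string shortest_top_level_alternative(const std::string& regex)
{
    std::vector<std::string> parts(1);
    int depth = 0;         // nesting level of (...)
    bool in_class = false; // true while inside [...]
    for (const char c : regex) {
        if (in_class) {
            in_class = (c != ']');
        } else if (c == '[') {
            in_class = true;
        } else if (c == '(') {
            ++depth;
        } else if (c == ')') {
            --depth;
        } else if (c == '|' && depth == 0) {
            parts.emplace_back(); // start the next alternative
            continue;
        }
        parts.back() += c;
    }
    std::size_t best = 0;
    for (std::size_t k = 1; k < parts.size(); ++k) {
        if (parts[k].size() < parts[best].size()) {
            best = k;
        }
    }
    return parts[best];
}
} // namespace elm_sketch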
class Database
{
private:
std::string all = "VLTKWHFIRMAPGSCNQYDE";
std::string marked_motifs_filename = "";
std::string complement(std::string);
std::vector<Motif_class> motifs_eliminated;
bool clean_ends;
bool clean_marked;
/*
Simplify a regex: for a repetition {a,b} always use the lower bound a, and for an alternation | always take the shortest alternative.
*/
std::vector<std::string> proc(std::string);
/*
INPUT: database location
OUTPUT: vector of pairs with first coordinate the name of the linear motif, second coordinate regex.
*/
void load_motif_database(std::string);
/*
Eliminate start and end positions with more than 10 possibilities. If a motif is left empty after trimming, it is eliminated, and the eliminated motifs are reported.
*/
void clean_end_characters(void);
std::vector<Motif_class> motif_database;