H-InvDB_8.3 released on March 26, 2013.
H-Invitational ID:
HIT000000610
Accession number:
AB020700
Created date:
26-Mar-2013
Last modified:
20-Apr-2012
Definition:
WD repeat-containing protein 47 isoform 3.
Transcript original information
Accession number
AB020700.1
CAGE tag ID
NA
EST ID
NA
Clone Number
hk08702
Experimental resources
NBRC; HGPD; Antibody (WDR47); Catalog (WDR47);
Sequence data provider
Provider: KDRI;
Annotation project
H-Invitational FLcDNA
Length of cDNA
4195[bp] (No. of exon:15)[A:1205 T:1232 G:922 C:836]
Division
HUM
Molecular type
mRNA
Library origin
Cell type
NA
Tissue type
brain
Developmental stage
adult
Sequence quality information
CDS feature
Complete CDS
Kozak sequence
NA
PolyA
NA
Vector/adapter sequence
NA
Frame shift
NA
Remaining intron
NA
Splice site acceptor (NAGNAG)
AAGAAG;
Transcript quality feature
NA
Notes
NA
GTCGCTGGGCCGGGAGGGGCGGACGTGAGAAGGACGGATTGACGAACTGA TGGATTGACGCGCGGGCGGTAGGAGGGAGGACCGACGCCAAACCCAGACC GCCGCCGTCGTGCTCCTGCCGCAGCCCGGAGCCGGCCGCTTCGGGGCCCT GGCCGCCGGCCTCCCAGCCGCGTTCTCCTCCGCCGCTCCTCCGGGCTTGC CCTGGAGCCCTCAGGCTATCAATATGACGGCTGAAGAAACAGTGAATGTA AAAGAGGTTGAAATCATTAAGCTAATTTTGGACTTCCTGAATTCAAAGAA GCTTCACATTAGTATGCTGGCCCTGGAGAAGGAAAGTGGAGTCATAAATG GCCTGTTTTCAGATGATATGCTTTTCCTGAGGCAGCTAATACTTGATGGT CAATGGGATGAAGTTCTTCAGTTCATTCAGCCTCTAGAATGTATGGAAAA ATTTGACAAAAAAAGGTTTCGTTATATTATCCTGAAGCAGAAGTTTTTAG AAGCTTTATGTGTTAACAACGCGATGTCAGCAGAAGATGAGCCCCAGCAT CTGGAATTTACCATGCAAGAAGCTGTGCAATGTTTACATGCTCTAGAAGA ATACTGTCCTTCTAAAGATGACTATAGTAAGCTCTGTTTGCTTTTGACTT TGCCTCGTCTGACCAATCATGCCGAGTTTAAGGACTGGAATCCCAGCACC GCACGAGTTCACTGTTTTGAAGAGGCTTGTGTCATGGTTGCAGAATTCAT CCCTGCTGATAGGAAGCTAAGTGAAGCTGGTTTTAAGGCTAGTAACAATC GTTTATTTCAGCTTGTAATGAAAGGCCTGCTTTATGAATGCTGTGTAGAA TTTTGTCAGAGTAAAGCAACTGGAGAAGAAATTACAGAAAGCGAAGTGCT TCTTGGCATCGACCTCTTATGTGGTAATGGTTGTGATGATTTGGATCTGA GTTTACTGTCATGGCTTCAGAATCTTCCATCTTCTGTCTTCTCTTGTGCT TTTGAACAGAAAATGCTTAATATTCATGTTGACAAACTTCTGAAACCTAC AAAAGCTGCATATGCTGATCTTTTGACTCCTCTTATCAGCAAACTCTCTC CCTATCCATCATCCCCAATGAGAAGACCTCAATCAGCTGATGCCTATATG ACCCGCTCTCTGAATCCTGCTTTAGATGGCCTCACCTGTGGACTAACCAG TCATGATAAGAGAATTTCAGACCTTGGAAACAAAACTTCTCCAATGTCAC ACTCCTTTGCTAACTTCCATTATCCAGGGGTACAAAACCTCAGTAGAAGT CTCATGCTTGAGAATACAGAATGTCACAGTATTTACGAAGAATCCCCTGA GCGTGATACACCTGTTGATGCACAGAGGCCTATCGGCAGTGAAATCTTGG GCCAGAGTTCAGTTTCAGAAAAAGAGCCTGCAAATGGAGCACAGAATCCA GGACCAGCTAAACAAGAAAAAAATGAGCTTCGAGATTCAACAGAACAATT TCAAGAATATTATAGGCAAAGATTACGCTATCAACAGCATTTAGAACAGA AGGAGCAACAGCGGCAGATATACCAACAGATGTTGCTTGAAGGAGGCGTG AATCAGGAGGATGGTCCTGATCAGCAGCAGAATCTTACTGAACAGTTCCT TAATAGGTCCATTCAAAAGCTTGGTGAATTAAATATTGGAATGGATGGCC TTGGTAATGAGGTATCAGCACTCAACCAGCAATGTAATGGGAGCAAAGGC AATGGATCTAATGGTTCTTCTGTGACTAGTTTTACTACACCACCCCAAGA CTCTAGTCAGAGATTAACACATGATGCTTCAAATATTCATACAAGCACTC CTCGTAATCCTGGATCAACAAATCACATACCTTTTCTGGAGGAATCACCT TGTGGAAGCCAAATCTCTTCAGAACATTCGGTCATTAAGCCACCTCTTGG 
AGATTCTCCAGGGAGTCTTTCAAGGTCGAAAGGGGAAGAGGATGACAAAT CAAAAAAGCAGTTTGTTTGTATTAATATCCTAGAAGACACACAAGCTGTT AGAGCAGTGGCTTTTCATCCAGCTGGAGGTTTATATGCTGTTGGTTCAAA TTCAAAAACTCTGAGAGTATGTGCCTATCCAGATGTAATTGATCCAAGTG CACATGAGACTCCTAAGCAGCCGGTGGTACGTTTTAAAAGGAATAAACAT CATAAAGGATCCATTTACTGTGTGGCCTGGAGTCCTTGTGGGCAGTTATT AGCAACAGGATCAAATGACAAATACGTCAAAGTGCTGCCCTTCAATGCAG AGACTTGTAACGCAACAGGACCAGATCTGGAATTTAGTATGCATGATGGA ACAATTAGAGACTTGGCATTTATGGAAGGCCCAGAAAGCGGAGGAGCTAT TTTAATAAGTGCTGGAGCAGGGGATTGTAACATTTATACAACCGATTGTC AAAGAGGTCAGGGCCTCCATGCTTTGAGTGGACATACTGGGCATATTTTA GCACTTTATACCTGGAGTGGCTGGATGATTGCATCTGGTTCCCAAGATAA GACTGTTAGATTTTGGGATCTTCGAGTACCAAGTTGTGTTCGTGTTGTTG GCACAACATTTCATGGAACTGGCAGTGCAGTGGCATCTGTAGCTGTAGAT CCCAGTGGTCGTCTCTTAGCCACAGGTCAAGAAGATTCTAGCTGCATGTT GTATGACATAAGAGGAGGAAGAATGGTACAAAGTTATCATCCTCATTCCA GTGATGTTCGCTCTGTTCGATTCTCCCCTGGAGCTCACTACTTGCTAACA GGCTCTTATGATATGAAAATAAAGGTGACAGACCTACAAGGGGACCTCAC CAAGCAGCTTCCTATCATGGTGGTGGGGGAGCACAAGGACAAAGTGATTC AGTGCAGATGGCACACCCAGGATCTTTCCTTCCTGTCATCCTCTGCAGAT AGAACTGTCACCCTCTGGACTTACAATGGGTAGAGCACACCGCATGTCAG TCTATGCAGCAAAAGCACAGAGACTTAAGACTACTGAGTTGTGAAAATTA CAAATCTGAAGAACATAGTGTCCAGGAAAGTGGTTTAGCACGAAGAGGCC CCTTATTACCATGTATCCCACTGATAGGAGGTGTTGGGTGGTGTTATTCC GCAGTGCTTTCAGTCTTCCATGTGAGCTCGTGCTGCTGTGACCTGCTATA TGTAGTCTCGTTGCCAAAGTCTGCAGAAGAGCTCTTCAGTTGTTGGTGTG CACTCCAAGTCAGGATGGACAATGTGTTTACGGTTTAGTATTCAATGCAT TCCTTGGTCTTTGCCTAAATAACAGTTTTATATGCACATTGAAATGGAAT TATACTTCAACTATATTATTAAATGTAATGCAACCAAGTTCCTCCCAGAT TAAACTTCCCAGGTGTTCAGAATTACTTTTGCTCTTCTCACGATCCCATA TTGTATTATCACTTGTCTTCTAGAGGTCAGAATTCCATAATATATGTCAC TCAAAAGTTACATGGTTGCTTTCACTTAAGGATCATTATGGAGTTTAAAG ATGAATGAAAAACTGCTTCTTAGTTTACTACATGGTATAGGCCCTTTTTT CTTAAACCCAGGGATATGATTATTTTGTCATATAATTTTGTTTCAGGCTA AAAGGTAAATGTGTTTGCTTCAGAAACTTGTTAACTTCAGTTTTTTGAAT GCAACAGGATACCTCCCTTCCAAACTGAACTGTAGAAGCAGAGCAGCAGC AGTTATGTGATGCAACACTTGATGGTACAGTAAATTTACTGGCATTTTTC TCCTTAAAAATTAAAATCCTTGACATAGACCATAGCATGGCTTGAAATGC TATGTCTGCATGATAATTTAAAATGGAAGATTTAAACTTTGCACTCCAAA 
AGCTTATTTGGATTTTTTTCTTGCACTGTTTTGTGTAATGCAGAATAATG ATTTTATTTCTACAGCTTTGTAGATTCTAACATTTATGTATCTTTATTTT CATATTGTACAGTAATTTTACTTTAAATTATTTAAATAGGCTATTTTATT TATTTCAAATGCAGTTGTATTAGTTCTCATTATTGAACTGTCTGTGCACT GTATGTAGCAAGCATTTTTCATCTGTTGTATACAAGTGGAAAGGGTATTA GAAGTGTAACTGTGCTATTATTTCAATAAAGACCTCTTGACATTT
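The per-base counts listed under "Length of cDNA" can be cross-checked against the stated length of 4195 bp, and a GC fraction can be derived from them. The following is a minimal sketch in Python that uses only the four counts from this entry; the function name `base_composition` is illustrative and is not part of H-InvDB.

```python
def base_composition(counts):
    """Return (total length, GC fraction) from a dict of per-base counts."""
    total = sum(counts.values())
    gc_fraction = (counts["G"] + counts["C"]) / total
    return total, gc_fraction


# Counts copied from the "Length of cDNA" field of this entry.
counts = {"A": 1205, "T": 1232, "G": 922, "C": 836}

total, gc_fraction = base_composition(counts)
print(total)                  # 4195, matching the stated cDNA length
print(round(gc_fraction, 4))  # 0.4191
```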
Gene structure information
H-Inv cluster ID
HIX0199788
Genomic location
Chromosome
1
Location
NA
Position
109512840-109584697
Strand
-
Possible duplicated location(s)
NA
Gene structure
15 exon(s)
Database links
RefSeq
NM_014969; NM_001142550; NM_001142551;
Ensembl
ENST00000357672; ENST00000361054; ENST00000369962; ENST00000369965; ENST00000400794; ENST00000528747; ENST00000529074; ENST00000530772; ENST00000531337;
Entrez Gene
Entrez Gene ID:22911;
KEGG GENES
KEGG GENES(22911);
GeneCard
WDR47;
*GeneCards is provided free to academic non-profit institutions.
Related H-InvDB links
H-DBAS; G-integra; cDNA-genome alignment;
Predicted CDS information
HIP ID
HIP000105298
Predicted CDS
224..2983; 919[aa]; Orientation:+2;
Codon Adaptation Index (CAI)
0.697
Database links
RefSeq
NP_001136023;
UniProt
O94967;
CCDS
CCDS44187;
MTAEETVNVKEVEIIKLILDFLNSKKLHISMLALEKESGVINGLFSDDML FLRQLILDGQWDEVLQFIQPLECMEKFDKKRFRYIILKQKFLEALCVNNA MSAEDEPQHLEFTMQEAVQCLHALEEYCPSKDDYSKLCLLLTLPRLTNHA EFKDWNPSTARVHCFEEACVMVAEFIPADRKLSEAGFKASNNRLFQLVMK GLLYECCVEFCQSKATGEEITESEVLLGIDLLCGNGCDDLDLSLLSWLQN LPSSVFSCAFEQKMLNIHVDKLLKPTKAAYADLLTPLISKLSPYPSSPMR RPQSADAYMTRSLNPALDGLTCGLTSHDKRISDLGNKTSPMSHSFANFHY PGVQNLSRSLMLENTECHSIYEESPERDTPVDAQRPIGSEILGQSSVSEK EPANGAQNPGPAKQEKNELRDSTEQFQEYYRQRLRYQQHLEQKEQQRQIY QQMLLEGGVNQEDGPDQQQNLTEQFLNRSIQKLGELNIGMDGLGNEVSAL NQQCNGSKGNGSNGSSVTSFTTPPQDSSQRLTHDASNIHTSTPRNPGSTN HIPFLEESPCGSQISSEHSVIKPPLGDSPGSLSRSKGEEDDKSKKQFVCI NILEDTQAVRAVAFHPAGGLYAVGSNSKTLRVCAYPDVIDPSAHETPKQP VVRFKRNKHHKGSIYCVAWSPCGQLLATGSNDKYVKVLPFNAETCNATGP DLEFSMHDGTIRDLAFMEGPESGGAILISAGAGDCNIYTTDCQRGQGLHA LSGHTGHILALYTWSGWMIASGSQDKTVRFWDLRVPSCVRVVGTTFHGTG SAVASVAVDPSGRLLATGQEDSSCMLYDIRGGRMVQSYHPHSSDVRSVRF SPGAHYLLTGSYDMKIKVTDLQGDLTKQLPIMVVGEHKDKVIQCRWHTQD LSFLSSSADRTVTLWTYNG*
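The "Predicted CDS" field (224..2983; 919[aa]; Orientation:+2) can be checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, that the final codon is a stop codon (as the trailing "*" in the protein sequence suggests), and that "+2" denotes the forward reading frame of the start position. The function name `cds_summary` is hypothetical.

```python
def cds_summary(start, end):
    """Summarise a CDS given 1-based, inclusive transcript coordinates.

    Returns (length in nt, number of codons, residues excluding the
    stop codon, forward reading frame of the start position).
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = length_nt // 3
    residues = codons - 1          # assumption: last codon is a stop codon
    frame = (start - 1) % 3 + 1    # assumption: frame 1 starts at position 1
    return length_nt, codons, residues, frame


length_nt, codons, residues, frame = cds_summary(224, 2983)
print(length_nt, codons, residues, frame)  # 2760 920 919 2
```

Under these assumptions the result agrees with the 919 amino acids and the +2 orientation reported for this entry.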
Motif information
a.a. length  InterPro   Name
33           IPR006594  LisH dimerisation motif [Domain]
58           IPR006595  CTLH, C-terminal LisH motif [Domain]
39           IPR001680  WD40 repeat [Repeat]
315          IPR015943  WD40/YVTN repeat-like-containing domain [Domain]
27           IPR019781  WD40 repeat, subgroup [Repeat]
311          IPR011046  WD40 repeat-like-containing domain [Domain]
43           IPR001680  WD40 repeat [Repeat]
31           IPR019782  WD40 repeat 2 [Repeat]
263          IPR017986  WD40-repeat-containing domain [Domain]
30           IPR019781  WD40 repeat, subgroup [Repeat]
45           IPR001680  WD40 repeat [Repeat]
39           IPR001680  WD40 repeat [Repeat]
37           IPR019781  WD40 repeat, subgroup [Repeat]
41           IPR019782  WD40 repeat 2 [Repeat]
15           IPR019775  WD40 repeat, conserved site [Conserved_site]
40           IPR001680  WD40 repeat [Repeat]
35           IPR019781  WD40 repeat, subgroup [Repeat]
42           IPR019782  WD40 repeat 2 [Repeat]
40           IPR001680  WD40 repeat [Repeat]
34           IPR019781  WD40 repeat, subgroup [Repeat]
35           IPR019782  WD40 repeat 2 [Repeat]
40           IPR001680  WD40 repeat [Repeat]
35           IPR019781  WD40 repeat, subgroup [Repeat]
36           IPR019782  WD40 repeat 2 [Repeat]
Gene function information
H-Inv ID
HIT000000610
H-Inv cluster ID
HIX0199788
Accession number
AB020700.1
CAGE tag ID
NA
EST ID
NA
Transcript feature
Representative transcript;
Splicing isoform
Coding potential
Protein coding;
Definition
WD repeat-containing protein 47 isoform 3.
Similarity category
Category: Identical to known human protein (Category I).
Identical to known human protein (NP_001136023) [Identity/coverage = 100.0%/100.0%] to Homo sapiens protein.
Experimental evidence
Protein evidence
PubMed ID
NA
Gene family/group
H-Inv gene family/group ID
NA
Gene family/group name
NA
Evidence motif (InterPro) ID
NA
Gene symbol/name
HGNC symbol
WDR47
HGNC aliases
NA
HGNC name
WD repeat domain 47
DDBJ
KIAA0893
UniProt
NA
EC number
NA
GGDB (GlycoGene Database)
Gene symbol
NA
Family
NA
Designation
NA
Expression
NA
KEGG metabolic pathway
NA
Protein-protein interaction (PPI)
H-Inv protein ID
HIP000105298
No. of interaction
12
Interaction partner(s)
HIP000030585; HIP000033190; HIP000039383; HIP000039653; HIP000060275; HIP000071861; HIP000076344; HIP000083959; HIP000096070; HIP000100267; HIP000192629; HIP000333923;
BIND
NA
DIP
NA
MINT
MINT-6488791; MINT-7945693;
HPRD
01475; 03590; 04155; 05923; 07411; 09732; 11822; 11906; 11907; 11909; 11912; 13626; 16059;
IntAct
EBI-2880601; EBI-2880211; EBI-1237540; EBI-3962281;
Database links
RefSeq
NM_014969; NM_001142550; NM_001142551;
Ensembl
ENST00000357672; ENST00000361054; ENST00000369962; ENST00000369965; ENST00000400794; ENST00000528747; ENST00000529074; ENST00000530772; ENST00000531337;
Entrez Gene
Entrez Gene ID:22911;
KEGG GENES
KEGG GENES(22911);
GeneCard
WDR47;
etc
Human-Gene diversity Of Life-style related Diseases;
Curation status
Human curated
Notes
NA
Related H-InvDB links
Gene family; Similarity Search Tool; TACT;
fRNAdb (The functional RNA database) : overlapping fRNAdb entries with H-InvDB transcripts based on the location of the genome.
NA
Subcellular localization information
Last modified: 20-Apr-2012
WoLF PSORT
nuclear; cytosol;
Target P
Other
SOSUI
soluble protein
TMHMM
soluble protein
PTS1
Not targeted
Related H-InvDB links
LIFEdb;
Protein structure information (GTOP)
Last modified: 20-Apr-2012
Start  End  PDB_ID  E-value  Identity  Coverage  SCOP_ID
13     76   1uujA   3e-11    19.7      61/76     a.221.1.1
594    915  1nr0A1  4e-29    16.2      296/311   b.69.4.1
Related H-InvDB links
GTOP
Gene expression information
Last modified: 20-Apr-2012
Tissue-specific expression
NA
Probe information
AceGene
AGhsA240912;
Affymetrix GeneChip
HG-Focus
NA
HG-U133
203855_at;
HG-U133A
203855_at;
HG-U133A_2
203855_at;
HG-U133B
NA
HG-U133_Plus_2
203855_at;
HG-U95
35720_at;
HG-U95A
35720_at;
HG-U95B
NA
HG-U95C
NA
HG-U95D
NA
HG-U95E
NA
HG-U95Av2
NA
HuEx-1_0
2350464; 2426841; 2426842; 2426843; 2426844; 2426846; 2426848; 2426849; 2426850; 2426852; 2426854; 2426857; 2426858; 2426859; 2426860; 2426861; 2426862; 2426863; 2426865; 4052652; 4052654; 4052678; 4052680; 4052682; 4054126; 4054127; 4054132; 4054133;
HuGeneFL
NA
Agilent
Human 1A Oligo Microarray:PGID215
A_23_P23748;
Whole Human Genome Oligo Microarray:PGID247
A_23_P23748;
Related H-InvDB links
H-ANGEL; DNAProbeLocator;
Disease/pathology information
Last modified: 20-Apr-2012
Disease relation
Disease name:NA
Related information in OMIM
OMIM ID:NA Title:NA
Co-localized orphan diseases
OMIM ID: 115665; 116600; 155600; 600975; 605225; 605606; 606788; 606852; 606928; 607317; 607671; 608543; 608553; 608995; 610320; 612367; 612596;
Disease related mutation
NA
Literature-Extracted GENe-Disease Associations (LEGENDA)
Gene name
Entrez Gene ID:(22911)
Disease
Entrez Gene ID:(22911)
Substance
Entrez Gene ID:(22911)
Related H-InvDB links
DiseaseInfo Viewer; LEGENDA;
Evolutionary information
Last modified: 20-Apr-2012
Relationship  Species                    Accession number    MGI          Links
Orthology     Mus sp. (Mouse)            BC040337            MGI:2139593  G-integra
Orthology     Macaca sp. (Macaque)       ENSMMUT00000012951               G-integra
Orthology     Oryzias sp. (Medaka)       ENSORLT00000008618               G-integra
Orthology     Oryzias sp. (Medaka)       ENSORLT00000019622               G-integra
Orthology     Pongo sp. (Orangutan)      ENSPPYT00000001301               G-integra
Orthology     Pan sp. (Chimpanzee)       ENSPTRT00000001956               G-integra
Orthology     Tetraodon sp. (Tetraodon)  GSTENT00028473001                G-integra
Orthology     Takifugu sp. (Fugu)        SINFRUT00000160446               G-integra
Orthology     Takifugu sp. (Fugu)        SINFRUT00000166078               G-integra
Orthology     Bos sp. (Cow)              XM_001253082                     G-integra
Orthology     Monodelphis sp. (Opossum)  XM_001381894                     G-integra
Orthology     Equus sp. (Horse)          XM_001493773                     G-integra
Orthology     Danio sp. (Zebrafish)      XM_001922584                     G-integra
Orthology     Gallus sp. (Chicken)       XM_422187                        G-integra
Orthology     Canis sp. (Dog)            XM_547247                        G-integra
Phylogenetic tree [View by ATV]
Neighbor-joining (phb)
Related H-InvDB links
Evola; dN/dS (under construction);
Polymorphism (SNP, indel), microsatellite (Short Tandem Repeat, STR) and repeat information
Single Nucleotide Polymorphism (SNP) and indel
Location      Variation  dbSNP ID     Strand  CDS/UTR  Translation
422 .. 422    T/C        rs112410635  -       CDS      Nonsynonymous[Phe67Leu]
482 .. 482    C/T        rs55730953   -       CDS      Synonymous[Leu87Leu]
1392 .. 1392  A/G        rs79997809   -       CDS      Nonsynonymous[Glu390Gly]
1827 .. 1827  C/T        rs113434051  -       CDS      Nonsynonymous[Ala535Val]
2080 .. 2080  T/C        rs41299563   -       CDS      Synonymous[Gly619Gly]
2429 .. 2429  A/C        rs1538137    -       CDS      Nonsynonymous[Asn736His]
2443 .. 2443  C/T        rs76370963   -       CDS      Synonymous[Thr740Thr]
2481 .. 2481  G/T        rs74576781   -       CDS      Nonsynonymous[Gly753Val]
2778 ^ 2779   -/C        rs35988108   -       CDS
3345 .. 3345  T/G        rs112609892  -       3'UTR
3538 .. 3538  A/G        rs507776     -       3'UTR
3756 .. 3756  T/A        rs114053618  -       3'UTR
3761 .. 3761  T/C        rs11803800   -       3'UTR
4181 .. 4181  G/C        rs12068536   -       3'UTR
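The codon numbers in the Translation column appear to follow from the predicted CDS coordinates (224..2983) given earlier in this entry. As a hedged illustration, the Python sketch below maps a transcript position to a region label and a codon number, assuming 1-based positions on the cDNA and a CDS that starts at position 224. The names `region` and `codon_number` are hypothetical helpers, not H-InvDB functions.

```python
CDS_START = 224   # from the "Predicted CDS" field of this entry
CDS_END = 2983


def region(pos):
    """Classify a 1-based transcript position as 5'UTR, CDS or 3'UTR."""
    if pos < CDS_START:
        return "5'UTR"
    if pos > CDS_END:
        return "3'UTR"
    return "CDS"


def codon_number(pos):
    """Return the 1-based codon number for a CDS position, else None."""
    if region(pos) != "CDS":
        return None
    return (pos - CDS_START) // 3 + 1


# Positions taken from the SNP table above.
for pos in (422, 482, 1392, 2429, 3345):
    print(pos, region(pos), codon_number(pos))
# 422 CDS 67
# 482 CDS 87
# 1392 CDS 390
# 2429 CDS 736
# 3345 3'UTR None
```

These values match the residue numbers in the table (for example Phe67Leu at position 422 and Asn736His at position 2429).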
Microsatellite (Short Tandem Repeat, STR)
No data available
Microsatellite: Human-Gene diversity Of Life-style related Diseases (H-GOLD)
No data available
Repeat
No data available
Database links
Human-Gene diversity Of Life-style related Diseases (H-GOLD);
Related H-InvDB links
VaryGene; Repeat Mask Viewer;