H-InvDB_8.3 released on March 26, 2013.
H-Invitational ID:
HIT000384911
Accession number:
AK225781
Created date:
26-Mar-2013
Last modified:
20-Apr-2012
Definition:
WD repeat-containing protein 47 isoform 1.
Transcript original information
Accession number
AK225781.1
CAGE tag ID
NA
EST ID
NA
Clone Number
FCC132C06
Experimental resources
NBRC; HGPD; Antibody (WDR47); Catalog (WDR47)
Sequence data provider
NA
Annotation project
NA
Length of cDNA
4154[bp] (No. of exon:15)[A:1220 T:1230 G:884 C:820]
Division
HUM
Molecular type
mRNA
Library origin
Cell type
NA
Tissue type
brain
Developmental stage
NA
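The "Length of cDNA" value above packs the sequence length, the exon count and the per-base counts into one string. As a minimal sketch (the function name and the regular expression are illustrative assumptions, not part of H-InvDB), the field can be split and sanity-checked like this:

```python
import re

# Value copied from the "Length of cDNA" field of this record.
FIELD = "4154[bp] (No. of exon:15)[A:1220 T:1230 G:884 C:820]"


def parse_cdna_length(field):
    """Return (length, exon_count, base_counts) parsed from the field.

    The pattern is an assumption inferred from this single record.
    """
    m = re.match(r"(\d+)\[bp\]\s*\(No\. of exon:(\d+)\)\[([^\]]*)\]", field)
    if m is None:
        raise ValueError(f"unrecognised field: {field!r}")
    counts = {base: int(n) for base, n in re.findall(r"([ACGT]):(\d+)", m.group(3))}
    return int(m.group(1)), int(m.group(2)), counts


length, exons, counts = parse_cdna_length(FIELD)

# The four base counts should add up to the stated sequence length.
counts_match_length = sum(counts.values()) == length

# GC fraction derived from the listed counts.
gc_fraction = (counts["G"] + counts["C"]) / length

print(length, exons, counts_match_length, round(gc_fraction, 3))
```

For this record the four base counts add up to the stated length, so the composition covers the whole cDNA with no ambiguous bases.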
Sequence quality information
CDS feature
N-truncated
Kozak sequence
NA
PolyA
Site: 4133(+) Signal: 4112-4116(+)
Vector/adapter sequence
NA
Frame shift
NA
Remaining intron
NA
Splice site acceptor (NAGNAG)
AAGAAG;
Transcript quality feature
NA
Notes
NA
AAAACCCAGACCGCCGCCGTCGTGCTCCTGCCGCAGCCCGGAGCCGGCCG CTTCGGGGCCCTGGCCGCCGGCCTCCCAGCCGCGTTCTCCTCCGCCGCTC CTCCGGGCTTGCCCTGGAGCCCTCAGGCTATCAATATGACGGCTGAAGAA ACAGTGAATGTAAAAGAGGTTGAAATCATTAAGCTAATTTTGGACTTCCT GAATTCAAAGAAGCTTCACATTAGTATGCTGGCCCTGGAGAAGGAAAGTG GAGTCATAAATGGCCTGTTTTCAGATGATATGCTTTTCCTGAGGCAGCTA ATACTTGATGGTCAATGGGATGAAGTTCTTCAGTTCATTCAGCCTCTAGA ATGTATGGAAAAATTTGACAAAAAAAGGTTTCGTTATATTATCCTGAAGC AGAAGTTTTTAGAAGCTTTATGTGTTAACAACGCGATGTCAGCAGAAGAT GAGCCCCAGCATGTAAGATTTTTATTCCTGAAGCTGGAATTTACCATGCA AGAAGCTGTGCAATGTTTACATGCTCTAGAAGAATACTGTCCTTCTAAAG ATGACTATAGTAAGCTCTGTTTGCTTTTGACTTTGCCTCGTCTGACCAAT CATGCCGAGTTTAAGGACTGGAATCCCAGCACCGCACGAGTTCACTGTTT TGAAGAGGCTTGTGTCATGGTTGCAGAATTCATCCCTGCTGATAGGAAGC TAAGTGAAGCTGGTTTTAAGGCTAGTAACAATCGTTTATTTCAGCTTGTA ATGAAAGGCCTGCTTTATGAATGCTGTGTAGAATTTTGTCAGAGTAAAGC AACTGGAGAAGAAATTACAGAAAGCGAAGTGCTTCTTGGCATCGACCTCT TATGTGGTAATGGTTGTGATGATTTGGATCTGAGTTTACTGTCATGGCTT CAGAATCTTCCATCTTCTGTCTTCTCTTGTGCTTTTGAACAGAAAATGCT TAATATTCATGTTGACAAACTTCTGAAACCTACAAAAGCTGCATATGCTG ATCTTTTGACTCCTCTTATCAGCAAACTCTCTCCCTATCCATCATCCCCA ATGAGAAGACCTCAATCAGCTGATGCCTATATGACCCGCTCTCTGAATCC TGCTTTAGATGGCCTCACCTGTGGACTAACCAGTCATGATAAGAGAATTT CAGACCTTGGAAACAAAACTTCTCCAATGTCACACTCCTTTGCTAACTTC CATTATCCAGGGGTACAAAACCTCAGTAGAAGTCTCATGCTTGAGAATAC AGAATGTCACAGTATTTACGAAGAATCCCCTGAGCGAAGTGATACACCTG TTGATGCACAGAGGCCTATCGGCAGTGAAATCTTGGGCCAGAGTTCAGTT TCAGAAAAAGAGCCTGCAAATGGAGCACAGAATCCAGGACCAGCTAAACA AGAAAAAAAATGAGCTTCGAGATTCAACAGAACAATTTCAAGAATATTAT AGGCAAAGATTACGCTATCAACAGCATTTAGAACAGAAGGAGCAACAGCG GCAGATATACCAACAGATGTTGCTTGAAGGAGGCGTGAATCAGGAGGATG GTCCTGATCAGCAGCAGAATCTTACTGAACAGTTCCTTAATAGGTCCATT CAAAAGCTTGGTGAATTAAATATTGGAATGGATGGCCTTGGTAATGAGGT ATCAGCACTCAACCAGCAATGTAATGGGAGCAAAGGCAATGGATCTAATG GTTCTTCTGTGACTAGTTTTACTACACCACCCCAAGACTCTAGTCAGAGA TTAACACATGATGCTTCAAATATTCATACAAGCACTCCTCGTAATCCTGG ATCAACAAATCACATACCTTTTCTGGAGGAATCACCTTGTGGAAGCCAAA TCTCTTCAGAACATTCGGTCATTAAGCCACCTCTTGGAGATTCTCCAGGG AGTCTTTCAAGGTCGAAAGGGGAAGAGGATGACAAATCAAAAAAGCAGTT 
TGTTTGTATTAATATCCTAGAAGACACACAAGCTGTTAGAGCAGTGGCTT TTCATCCAGCTGGAGGTTTATATGCTGTTGGTTCAAATTCAAAAACTCTG AGAGTATGTGCCTATCCAGATGTAATTGATCCAAGTGCACATGAGACTCC TAAGCAGCCGGTGGTACGTTTTAAAAGGAATAAACATCATAAAGGATCCA TTTACTGTGTGGCCTGGAGTCCTTGTGGGCAGTTATTAGCAACAGGATCA AATGACAAATACGTCAAAGTGCTGCCCTTCAATGCAGAGACTTGTAACGC AACAGGACCAGATCTGGAATTTAGTATGCATGATGGAACAATTAGAGACT TGGCATTTATGGAAGGCCCAGAAAGCGGAGGAGCTATTTTAATAAGTGCT GGAGCAGGGGATTGTAACATTTATACAACCGATTGTCAAAGAGGACAGGG CCTCCATGCTTTGAGTGGACATACTGGGCATATTTTAGCACTTTATACCT GGAGTGGCTGGATGATTGCATCTGGTTCCCAAGATAAGACTGTTAGATTT TGGGATCTTCGAGTACCAAGTTGTGTTCGTGTTGTTGGCACAACATTTCA TGGAACTGGCAGTGCAGTGGCATCTGTAGCTGTAGATCCCAGTGGTCGTC TCTTAGCCACAGGTCAAGAAGATTCTAGCTGCATGTTGTATGACATAAGA GGAGGAAGAATGGTACAAAGTTATCATCCTCATTCCAGTGATGTTCGCTC TGTTCGATTCTCCCCTGGAGCTCACTACTTGCTAACAGGCTCTTATGATA TGAAAATAAAGGTGACAGACCTACAAGGGGACCTCACCAAGCAGCTTCCT ATCATGGTGGTGGGGGAGCACAAGGACAAAGTGATTCAGTGCAGATGGCA CACCCAGGATCTTTCCTTCCTGTCATCCTCTGCAGATAGAACTGTCACCC TCTGGACTTACAATGGGTAGAGCACACCGCATGTCAGTCTATGCAGCAAA AGCACAGAGACTTAAGACTACTGAGTTGTGAAAATTACAAATCTGAAGAA CATAGTGTCCAGGAAAGTGGTTTAGCACGAAGAGGCCCCTTATTACCATG TATCCCACTGATAGGAGGTGTTGGGTGGTGTTATTCCGCAGTGCTTTCAG TCTTCCATGTGAGCTCGTGCTGCTGTGACCTGCTATATGTAGTCTCGTTG CCAAAGTCTGCAGAAGAGCTCTTCAGTTGTTGGTGTGCACTCCAAGTCAG GATGGACAATGTGTTTACGGTTTAGTATTCAATGCATTCCTTGGTCTTTG CCTAAATAACAGTTTTATATGCACATTGAAATGGAATTATACTTCAACTA TATTATTAAATGTAATGCAACCAAGTTCCTCCCAGATTAAACTTCCCAGG TGTTCAGAATTACTTTTGCTCTTCTCACGATCCCATATTGTATTATCACT TGTCTTCTAGAGGTCAGAATTCCATAATATATGTCACTCAAAAGTTACAT GGTTGCTTTCACTTAAGGATCATTATGGAGTTTAAAGATGAATGAAAAAC TGCTTCTTAGTTTACTACATGGTATAGGCCCTTTTTTCTTAAACCCAGGG ATATGATTATTTTGTCATATAATTTTGTTTCAGGCTAAAAGGTAAATGTG TTTGCTTCAGAAACTTGTTAACTTCAGTTTTTTGAATGCAACAGGATACC TCCCTTCCAAACTGAACTGTAGAAGCAGAGCAGCAGCAGTTATGTGATGC AACACTTGATGGTACAGTAAATTTACTGGCATTTTTCTCCTTAAAAATTA AAATCCTTGACATAGACCATAGCATGGCTTGAAATGCTATGTCTGCATGA TAATTTAAAATGGAAGATTTAAACTTTGCACTCCAAAAGCTTATTTGGAT TTTTTTCTTGCACTGTTTTGTGTAATGCAGAATAATGATTTTATTTCTAC 
AGCTTTGTAGATTCTAACATTTATGTATCTTTATTTTCATATTGTACAGT AATTTTACTTTAAATTATTTAAATAGGCTATTTTATTTATTTCAAATGCA GTTGTATTAGTTCTCATTATTGAACTGTCTGTGCACTGTATGTAGCAAGC ATTTTTCATCTGTTGTATACAAGTGGAAAGGGTATTAGAAGTGTAACTGT GCTATTATTTCAATAAAGACCTCTTGACATTTAAAAAAAAAAAAAAAAAA AAAA
Gene structure information
H-Inv cluster ID
HIX0199788
Genomic location
Chromosome
1
Location
NA
Position
109512839- 109584608
Strand
-
Possible duplicated location(s)
NA
Gene structure
15 exon(s)
Database links
RefSeq: NM_014969; NM_001142550; NM_001142551
Ensembl: ENST00000357672; ENST00000361054; ENST00000369962; ENST00000369965; ENST00000400794; ENST00000528747; ENST00000529074; ENST00000530772; ENST00000531337
Entrez Gene: 22911
KEGG GENES: 22911
GeneCard: WDR47
*GeneCards is provided free to academic non-profit institutions.
Related H-InvDB links
H-DBAS; G-integra; cDNA-genome alignment
Predicted CDS information
HIP ID
HIP000353499
Predicted CDS
1130..2920; 596[aa]; Orientation:+2;
Codon Adaptation Index (CAI).
0.697
PVMIREFQTLETKLLQCHTPLLTSIIQGYKTSVEVSCLRIQNVTVFTKNP LSEVIHLLMHRGLSAVKSWARVQFQKKSLQMEHRIQDQLNKKKNELRDST EQFQEYYRQRLRYQQHLEQKEQQRQIYQQMLLEGGVNQEDGPDQQQNLTE QFLNRSIQKLGELNIGMDGLGNEVSALNQQCNGSKGNGSNGSSVTSFTTP PQDSSQRLTHDASNIHTSTPRNPGSTNHIPFLEESPCGSQISSEHSVIKP PLGDSPGSLSRSKGEEDDKSKKQFVCINILEDTQAVRAVAFHPAGGLYAV GSNSKTLRVCAYPDVIDPSAHETPKQPVVRFKRNKHHKGSIYCVAWSPCG QLLATGSNDKYVKVLPFNAETCNATGPDLEFSMHDGTIRDLAFMEGPESG GAILISAGAGDCNIYTTDCQRGQGLHALSGHTGHILALYTWSGWMIASGS QDKTVRFWDLRVPSCVRVVGTTFHGTGSAVASVAVDPSGRLLATGQEDSS CMLYDIRGGRMVQSYHPHSSDVRSVRFSPGAHYLLTGSYDMKIKVTDLQG DLTKQLPIMVVGEHKDKVIQCRWHTQDLSFLSSSADRTVTLWTYNG*
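The "Predicted CDS" value can be cross-checked against the protein length. The sketch below assumes 1-based inclusive coordinates, a trailing stop codon (the `*` at the end of the translation), and that "Orientation:+2" names the forward reading frame. These readings are inferred from the numbers in this record rather than taken from H-InvDB documentation, and the function name is hypothetical.

```python
import re

# Value copied from the "Predicted CDS" field of this record.
PREDICTED_CDS = "1130..2920; 596[aa]; Orientation:+2;"


def check_cds(field):
    """Parse the CDS field and derive span length, codon count and frame."""
    m = re.match(r"(\d+)\.\.(\d+);\s*(\d+)\[aa\];\s*Orientation:([+-])(\d);", field)
    if m is None:
        raise ValueError(f"unrecognised field: {field!r}")
    start, end, n_aa = int(m.group(1)), int(m.group(2)), int(m.group(3))
    n_nt = end - start + 1            # assumes inclusive coordinates
    return {
        "nt": n_nt,
        "codons": n_nt // 3,
        "in_frame": n_nt % 3 == 0,
        "aa": n_aa,
        "frame": (start - 1) % 3 + 1,  # frame 1 starts at position 1
        "listed_frame": int(m.group(5)),
    }


cds = check_cds(PREDICTED_CDS)
print(cds)
```

Under these assumptions the span is a whole number of codons, one more than the listed residue count (the extra codon being the stop), and the computed frame agrees with the listed orientation.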
Motif information
a.a. length  InterPro   Name
39           IPR001680  WD40 repeat [Repeat]
315          IPR015943  WD40/YVTN repeat-like-containing domain [Domain]
27           IPR019781  WD40 repeat, subgroup [Repeat]
311          IPR011046  WD40 repeat-like-containing domain [Domain]
43           IPR001680  WD40 repeat [Repeat]
31           IPR019782  WD40 repeat 2 [Repeat]
263          IPR017986  WD40-repeat-containing domain [Domain]
30           IPR019781  WD40 repeat, subgroup [Repeat]
45           IPR001680  WD40 repeat [Repeat]
39           IPR001680  WD40 repeat [Repeat]
37           IPR019781  WD40 repeat, subgroup [Repeat]
41           IPR019782  WD40 repeat 2 [Repeat]
15           IPR019775  WD40 repeat, conserved site [Conserved_site]
40           IPR001680  WD40 repeat [Repeat]
35           IPR019781  WD40 repeat, subgroup [Repeat]
42           IPR019782  WD40 repeat 2 [Repeat]
40           IPR001680  WD40 repeat [Repeat]
34           IPR019781  WD40 repeat, subgroup [Repeat]
35           IPR019782  WD40 repeat 2 [Repeat]
40           IPR001680  WD40 repeat [Repeat]
35           IPR019781  WD40 repeat, subgroup [Repeat]
36           IPR019782  WD40 repeat 2 [Repeat]
Gene function information
H-Inv ID
HIT000384911
H-Inv cluster ID
HIX0199788
Accession number
AK225781.1
CAGE tag ID
NA
EST ID
NA
Transcript feature
NO;
Splicing isoform
Coding potential
Protein coding;
Definition
WD repeat-containing protein 47 isoform 1.
Similarity category
Category: Identical to known human protein (Category I).
Identical to known human protein (NP_001136022) [Identity/coverage = 100.0%/100.0%] to Homo sapiens protein.
Experimental evidence
Protein evidence
PubMed ID
NA
Gene family/group
H-Inv gene family/group ID
NA
Gene family/group name
NA
Evidence motif (InterPro) ID
NA
Gene symbol/name
HGNC symbol
WDR47
HGNC aliases
NA
HGNC name
WD repeat domain 47
DDBJ
NA
UniProt
NA
EC number
NA
GGDB (GlycoGene Database)
Gene symbol
NA
Family
NA
Designation
NA
Expression
NA
KEGG metabolic pathway
NA
Protein-protein interaction (PPI)
H-Inv protein ID
HIP000353499
No. of interaction
NA
Interaction partner(s)
NA
BIND
NA
DIP
NA
MINT
NA
HPRD
NA
IntAct
NA
Database links
etc: Human-Gene diversity Of Life-style related Diseases
Curation status
Auto-annotated
Notes
NA
Related H-InvDB links
Gene family; Similarity Search Tool; TACT
fRNAdb (The functional RNA database) : overlapping fRNAdb entries with H-InvDB transcripts based on the location of the genome.
NA
Subcellular localization information
Last modified:20-Apr-2012
WoLF PSORT
nuclear; cytosol;
Target P
Other
SOSUI
soluble protein
TMHMM
soluble protein
PTS1
Not targeted
Related H-InvDB links
LIFEdb;
Protein structure information (GTOP)
Last modified:20-Apr-2012
Start  End  PDB_ID  E-value  Identity  Coverage  SCOP_ID
13     76   1uujA   3e-11    19.7      61/76     a.221.1.1
602    923  1nr0A1  4e-29    16.2      296/311   b.69.4.1
Related H-InvDB links
GTOP
Disease/pathology information
Last modified:20-Apr-2012
Disease relation
Disease name:NA
Related information in OMIM
OMIM ID:NA Title:NA
Co-localized orphan diseases
OMIM ID: 115665; 116600; 155600; 600975; 605225; 605606; 606788; 606852; 606928; 607317; 607671; 608543; 608553; 608995; 610320; 612367; 612596
Disease related mutation
NA
Literature-Extracted GENe-Disease Associations (LEGENDA)
Gene name
Entrez Gene ID:(22911)
Disease
Entrez Gene ID:(22911)
Substance
Entrez Gene ID:(22911)
Related H-InvDB links
DiseaseInfo Viewer; LEGENDA
Polymorphism (SNP, indel), microsatellite (Short Tandem Repeat, STR) and repeat information
Single Nucleotide Polymorphism (SNP) and indel
Location      Variation  dbSNP ID     Strand  CDS/UTR  Translation
334 .. 334    T/C        rs112410635  -       5'UTR
394 .. 394    C/T        rs55730953   -       5'UTR
1328 .. 1328  A/G        rs79997809   -       CDS      Nonsynonymous[Lys67Glu]
1764 .. 1764  C/T        rs113434051  -       CDS      Nonsynonymous[Ala212Val]
2017 .. 2017  T/C        rs41299563   -       CDS      Synonymous[Gly296Gly]
2366 .. 2366  A/C        rs1538137    -       CDS      Nonsynonymous[Asn413His]
2380 .. 2380  C/T        rs76370963   -       CDS      Synonymous[Thr417Thr]
2418 .. 2418  G/T        rs74576781   -       CDS      Nonsynonymous[Gly430Val]
2715 ^ 2716   -/C        rs35988108   -       CDS
3282 .. 3282  T/G        rs112609892  -       3'UTR
3475 .. 3475  A/G        rs507776     -       3'UTR
3693 .. 3693  T/A        rs114053618  -       3'UTR
3698 .. 3698  T/C        rs11803800   -       3'UTR
4118 .. 4118  G/C        rs12068536   -       3'UTR
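The residue numbers in the Translation column can be related to the transcript coordinates in the Location column. The sketch below assumes the predicted CDS of this record (1130..2920, 1-based and inclusive) and numbers codons from the first CDS base. The helper name is hypothetical, and the mapping is an inference that happens to match the listed values, not a documented H-InvDB procedure.

```python
import re

# Predicted CDS bounds taken from this record (assumed 1-based, inclusive).
CDS_START = 1130
CDS_END = 2920

# Transcript position -> listed amino-acid change, copied from the SNP table.
CODING_SNPS = {
    1328: "Lys67Glu",
    1764: "Ala212Val",
    2017: "Gly296Gly",
    2366: "Asn413His",
    2380: "Thr417Thr",
    2418: "Gly430Val",
}


def codon_number(pos, cds_start=CDS_START, cds_end=CDS_END):
    """Return the 1-based codon index containing transcript position `pos`."""
    if not cds_start <= pos <= cds_end:
        raise ValueError(f"position {pos} lies outside the CDS")
    return (pos - cds_start) // 3 + 1


# Compare the computed codon index with the residue number in each label.
agreement = {}
for pos, label in CODING_SNPS.items():
    listed = int(re.search(r"\d+", label).group())
    agreement[pos] = codon_number(pos) == listed
    print(pos, label, codon_number(pos), agreement[pos])
```

Positions annotated as 5'UTR or 3'UTR fall outside the 1130..2920 range, so `codon_number` raises `ValueError` for them.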
Microsatellite (Short Tandem Repeat, STR)
No data available
Microsatellite: Human-Gene diversity Of Life-style related Diseases (H-GOLD)
No data available
Repeat
No data available
Database links
Human-Gene diversity Of Life-style related Diseases (H-GOLD)
Related H-InvDB links
VaryGene; Repeat Mask Viewer