NIH chromosomes map
Human Chromosomes frequencies.
ftp web site
The downloaded files contain the following fields: gene name, chromosomal position, AC (accession number), Swiss-Prot entry name, MIM code, and description. They are tab-delimited and import cleanly into Excel when TAB is specified as the separator.
Use this link to see the original human chromosome files downloaded from Expasy.
The downloaded files do not contain the amino acid sequences of the proteins. We build these ourselves with a Perl script, starting from the "Swiss-Prot Entry name" column.
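As an alternative to the Excel import, the tab-separated layout can also be read programmatically. The following is only a minimal Python sketch: the key names in `FIELDS`, their order, the function name `read_chromosome_table`, and the sample row are all assumptions made for illustration and are not taken from the real files.

```python
import csv
import io

# Hypothetical key names based on the field list above; the column order
# in the real files is an assumption.
FIELDS = ["gene_name", "chromosomal_position", "ac",
          "entry_name", "mim_code", "description"]


def read_chromosome_table(handle):
    """Return one dict per non-empty tab-separated row."""
    rows = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and rows that contain only whitespace.
        if not any(cell.strip() for cell in row):
            continue
        cells = [cell.strip() for cell in row]
        # Pad short rows so that every key is present.
        cells += [""] * (len(FIELDS) - len(cells))
        rows.append(dict(zip(FIELDS, cells)))
    return rows


# Made-up sample row, for illustration only.
sample = "GENE1\t1p36.33\tP00001\tGENE1_HUMAN\t100001\tHypothetical protein 1\n\n"
table = read_chromosome_table(io.StringIO(sample))
entry_names = [row["entry_name"] for row in table]
```

The list `entry_names` is the part that matters for the next step, since the Swiss-Prot entry name is what the sequence retrieval starts from.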
Some proteins appear duplicated.
The script downloads the flat file for each protein and appends them one after the other into a single file for the chromosome under consideration.
The sequences are used to calculate the relative frequency of each amino acid within each protein and within each chromosome.
R is used to create the density charts and to run the Kruskal-Wallis test and the Wilcoxon rank sum test with continuity correction automatically, in a structured way, for each chromosome.
The R script organizes the resulting data into directories, one for each chromosome.
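To illustrate how sequences can be recovered from such a concatenated file, here is a minimal Python sketch. It is not the Perl script used here. It relies on the standard Swiss-Prot flat-file layout, in which each record starts with an `ID` line, the sequence follows the `SQ` line, and the record ends with `//`. The function name `parse_flat_file` and the sample records are made up, and keeping only the first copy of a duplicated entry is an assumption about how duplicates should be handled.

```python
def parse_flat_file(text):
    """Map entry name to sequence for concatenated flat-file records."""
    sequences = {}
    entry_name = None
    in_sequence = False
    chunks = []
    for line in text.splitlines():
        if line.startswith("ID "):
            # The entry name is the second token of the ID line.
            entry_name = line.split()[1]
        elif line.startswith("SQ "):
            in_sequence = True
            chunks = []
        elif line.startswith("//"):
            # End of record: keep only the first copy of a duplicated entry.
            if entry_name is not None and entry_name not in sequences:
                sequences[entry_name] = "".join(chunks)
            entry_name, in_sequence, chunks = None, False, []
        elif in_sequence:
            # Sequence lines are indented and split into blocks of ten residues.
            chunks.append(line.replace(" ", "").strip())
    return sequences


# Made-up records, for illustration only; the first one appears twice.
record_1 = (
    "ID   TEST1_HUMAN             Reviewed;          12 AA.\n"
    "AC   P00001;\n"
    "SQ   SEQUENCE   12 AA;  1234 MW;  0000000000000000 CRC64;\n"
    "     MKTAYIAKQR QI\n"
    "//\n"
)
record_2 = (
    "ID   TEST2_HUMAN             Reviewed;           5 AA.\n"
    "SQ   SEQUENCE   5 AA;  500 MW;  0000000000000000 CRC64;\n"
    "     GGAVL\n"
    "//\n"
)
parsed = parse_flat_file(record_1 + record_2 + record_1)
```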
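The frequency calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used here: only the 20 standard amino acids are counted, and the chromosome-level frequency is computed by pooling all residues of all proteins on that chromosome (averaging the per-protein frequencies would be a different, equally plausible definition). The function names are hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_frequencies(sequence):
    """Relative frequency of each standard amino acid in one sequence."""
    counts = Counter(sequence)
    # Non-standard symbols (for example X) are excluded from the total.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def chromosome_frequencies(sequences):
    """Pooled frequencies over all proteins of a chromosome (assumption)."""
    return relative_frequencies("".join(sequences))


# Made-up sequences, for illustration only.
protein_freqs = relative_frequencies("AAGG")
chromosome_freqs = chromosome_frequencies(["AAGG", "GGGG"])
```

With pooling, long proteins weigh more than short ones in the chromosome-level value; that is the main design choice to keep in mind when comparing the two definitions.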
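For readers unfamiliar with the Kruskal-Wallis test, the statistic it is based on can be sketched in plain Python. This is not the R script used here: it only computes the H statistic from pooled ranks, without the tie correction and without the p-value that a full implementation such as R's `kruskal.test` reports. The function names and the sample data are made up, and every group is assumed to be non-empty.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without tie correction."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        # Sum of the pooled ranks that belong to this group.
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up data: three groups that do not overlap at all.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

In this setting each group would hold the per-protein relative frequencies of one amino acid for one chromosome, so a large H suggests that at least one chromosome differs from the others for that amino acid.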
The distribution of the amino acids within the chromosomes is not random. Some chromosomes produce more amino acids of a certain type, while others show a different distribution.
Our goal is to understand this different behavior and to study the distribution of each single amino acid within a chromosome, in order to compare it with the more general distribution of the same amino acid across all human proteins.