2 Question 2: Lion King
2.1 A
Adapt the function from question 1 so that it can search any Ensembl dataset for associated filters and attributes. The function takes three arguments:
1. the Ensembl dataset
2. a search pattern (regex) for attributes
3. a search pattern (regex) for filters
For example: functie("dataset", "gene_id", "human")
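library(biomaRt)   # useEnsembl(), searchFilters(), searchAttributes()
library(magrittr)  # provides the %>% pipe

# Print the first matching filters and attributes of an Ensembl dataset.
find2 <- function(dataset, attributes_pattern, filter_pattern){
  mart <- useEnsembl(biomart = "genes", dataset = dataset)
  filters <- searchFilters(mart, filter_pattern)
  attributes <- searchAttributes(mart, attributes_pattern)
  head(filters) %>% print()
  head(attributes) %>% print()
}
find2(dataset = "hsapiens_gene_ensembl", "gene_id", "human")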
## name description
## 149 with_illumina_humanht_12_v3 With ILLUMINA HumanHT 12 V3 probe ID(s)
## 150 with_illumina_humanht_12_v4 With ILLUMINA HumanHT 12 V4 probe ID(s)
## 151 with_illumina_humanref_8_v3 With ILLUMINA HumanRef 8 V3 probe ID(s)
## 152 with_illumina_humanwg_6_v1 With ILLUMINA HumanWG 6 V1 probe ID(s)
## 153 with_illumina_humanwg_6_v2 With ILLUMINA HumanWG 6 V2 probe ID(s)
## 154 with_illumina_humanwg_6_v3 With ILLUMINA HumanWG 6 V3 probe ID(s)
## name description page
## 1 ensembl_gene_id Gene stable ID feature_page
## 2 ensembl_gene_id_version Gene stable ID version feature_page
## 82 entrezgene_id NCBI gene (formerly Entrezgene) ID feature_page
## 106 wikigene_id WikiGene ID feature_page
## 204 ensembl_gene_id Gene stable ID structure
## 205 ensembl_gene_id_version Gene stable ID version structure
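Both pattern arguments are regular expressions. The results printed in this report are consistent with the pattern being checked against every column of the table (name, description, page) rather than the name alone. The offline sketch below mimics that behaviour in base R. It is an illustration under that assumption, not biomaRt's own code: `mock_attributes` and `search_any_column` are hypothetical names, and the three rows are copied by hand from printed output instead of being fetched from Ensembl.

```r
# Mock attribute table; rows copied from printed output, not queried from Ensembl.
mock_attributes <- data.frame(
  name = c("ensembl_gene_id", "peptide_version", "protein_id"),
  description = c("Gene stable ID", "Version (protein)", "INSDC protein ID"),
  page = c("feature_page", "feature_page", "feature_page"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where the regex matches in at least one column.
search_any_column <- function(data, pattern) {
  per_column <- lapply(data, function(column) grepl(pattern, column))
  keep <- Reduce(`|`, per_column)  # TRUE if any column matched for that row
  data[keep, , drop = FALSE]
}

by_gene_id <- search_any_column(mock_attributes, "gene_id")
by_protein <- search_any_column(mock_attributes, "protein")

print(by_gene_id$name)
print(by_protein$name)
```

In this sketch the pattern "protein" also returns peptide_version, because the match is found in its description and not in its name.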
2.2 B
For each dataset, look up the attribute and filter given in the table. First look up the Ensembl dataset names for the listed organisms.
Dataset | Attribute | Filter |
---|---|---|
Lion | protein | chromosome |
Baboon | protein | chromosome |
Elephant | protein | chromosome |
Note: do not run the function three times with the given arguments. Use an R function that can perform iterations.
martALL <- useEnsembl("genes")
searchDatasets(martALL, "(L|l)ion")
## dataset description version
## 147 pleo_gene_ensembl Lion genes (PanLeo1.0) PanLeo1.0
searchDatasets(martALL, "(B|b)aboon")
## dataset description version
## 139 panubis_gene_ensembl Olive baboon genes (Panu_3.0) Panu_3.0
searchDatasets(martALL, "(E|e)lephant")
## dataset description
## 42 cmilii_gene_ensembl Elephant shark genes (Callorhinchus_milii-6.1.3)
## 85 lafricana_gene_ensembl Elephant genes (Loxafr3.0)
## version
## 42 Callorhinchus_milii-6.1.3
## 85 Loxafr3.0
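The patterns `(L|l)ion`, `(B|b)aboon` and `(E|e)lephant` handle capitalisation by hand, and the elephant search also returns the elephant shark. The offline sketch below shows one way to make the match case-insensitive and to narrow it down, using base R `grepl()` on a small mock table. `mock_datasets` is a hypothetical data frame with rows copied from the output above; whether `searchDatasets()` accepts a comparable case-insensitive option is not shown here.

```r
# Mock dataset table; rows copied from the printed search results.
mock_datasets <- data.frame(
  dataset = c("pleo_gene_ensembl", "panubis_gene_ensembl",
              "cmilii_gene_ensembl", "lafricana_gene_ensembl"),
  description = c("Lion genes (PanLeo1.0)",
                  "Olive baboon genes (Panu_3.0)",
                  "Elephant shark genes (Callorhinchus_milii-6.1.3)",
                  "Elephant genes (Loxafr3.0)"),
  stringsAsFactors = FALSE
)

# Case-insensitive match on the description column.
lion <- mock_datasets$dataset[
  grepl("lion", mock_datasets$description, ignore.case = TRUE)
]

# "elephant" alone matches both the elephant and the elephant shark.
elephant_all <- mock_datasets$dataset[
  grepl("elephant", mock_datasets$description, ignore.case = TRUE)
]

# Anchoring on "^elephant genes" excludes the elephant shark.
elephant_only <- mock_datasets$dataset[
  grepl("^elephant genes", mock_datasets$description, ignore.case = TRUE)
]

print(lion)
print(elephant_all)
print(elephant_only)
```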
DS <- c("pleo_gene_ensembl", "panubis_gene_ensembl", "lafricana_gene_ensembl")
for (x in DS) {
  print(x)
  find2(x, "protein", "chromosome")
}
## [1] "pleo_gene_ensembl"
## name description
## 1 chromosome_name Chromosome/scaffold name
## name
## 30 peptide_version
## 41 protein_id
## 120 peptide_version
## 160 peptide_version
## 173 cabingdonii_homolog_ensembl_peptide
## 177 cabingdonii_homolog_canonical_transcript_protein
## description page
## 30 Version (protein) feature_page
## 41 INSDC protein ID feature_page
## 120 Version (protein) structure
## 160 Version (protein) homologs
## 173 Abingdon island giant tortoise protein or transcript stable ID homologs
## 177 Query protein or transcript ID homologs
## [1] "panubis_gene_ensembl"
## name description
## 1 chromosome_name Chromosome/scaffold name
## name
## 30 peptide_version
## 44 protein_id
## 167 peptide_version
## 207 peptide_version
## 220 cabingdonii_homolog_ensembl_peptide
## 224 cabingdonii_homolog_canonical_transcript_protein
## description page
## 30 Version (protein) feature_page
## 44 INSDC protein ID feature_page
## 167 Version (protein) structure
## 207 Version (protein) homologs
## 220 Abingdon island giant tortoise protein or transcript stable ID homologs
## 224 Query protein or transcript ID homologs
## [1] "lafricana_gene_ensembl"
## name description
## 1 chromosome_name Chromosome/scaffold name
## name
## 30 peptide_version
## 42 protein_id
## 132 peptide_version
## 172 peptide_version
## 185 cabingdonii_homolog_ensembl_peptide
## 189 cabingdonii_homolog_canonical_transcript_protein
## description page
## 30 Version (protein) feature_page
## 42 INSDC protein ID feature_page
## 132 Version (protein) structure
## 172 Version (protein) homologs
## 185 Abingdon island giant tortoise protein or transcript stable ID homologs
## 189 Query protein or transcript ID homologs
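The for loop satisfies the requirement to iterate instead of calling the function three times. As an alternative, the same iteration can be written with `lapply()`, which collects one result per dataset in a list. The sketch below runs offline: `find2_stub` is a hypothetical stand-in that returns its arguments instead of querying Ensembl, so it only illustrates the iteration pattern and would need to be replaced by a function that returns the real search results.

```r
DS <- c("pleo_gene_ensembl", "panubis_gene_ensembl", "lafricana_gene_ensembl")

# Hypothetical stand-in for find2(): returns a list instead of printing,
# so the example runs without a network connection.
find2_stub <- function(dataset, attributes_pattern, filter_pattern) {
  list(
    dataset = dataset,
    attributes_pattern = attributes_pattern,
    filter_pattern = filter_pattern
  )
}

# lapply() calls the function once per dataset name;
# the two patterns are passed through unchanged on every call.
results <- lapply(
  DS,
  find2_stub,
  attributes_pattern = "protein",
  filter_pattern = "chromosome"
)
names(results) <- DS  # label each result with its dataset name

print(names(results))
```

Keeping the results in a named list makes it possible to inspect a single organism afterwards, for example with `results[["pleo_gene_ensembl"]]`.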