Can ChatGPT replace scientific researchers in bioinformatics in the future? I am currently working on a project about the population history of farm animals. I asked ChatGPT a few questions, and its answers shocked me. Here is our dialog.
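An Award Citation for Yu Hua, Written by ChatGPT
At the 65th anniversary celebration of the magazine Harvest (《收获》), Yu Hua won the best novel award for Wen Cheng (《文城》), and his good friend Mo Yan presented the award. Mo Yan said he had wanted to write a tribute for him but could not produce one after several days, so he asked ChatGPT to write it. Given the keywords "To Live" (活着), "tooth pulling" (拔牙), and "Wen Cheng" (文城), it instantly produced a Shakespearean-style tribute of more than a thousand characters. So I also asked ChatGPT some questions about award citations.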
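Detecting Genomic ROH with PLINK
Definition of ROH
When the parents share a recent common ancestor and both transmit the same chromosomal segment (identical by descent, IBD) to their offspring, the offspring's genome contains contiguous homozygous regions. These contiguous stretches of homozygous sites are called runs of homozygosity (ROH).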
Run of homozygosity
Run of homozygosity (ROH) refers to contiguous regions of the genome where an individual is homozygous across all sites. This arises if the haplotypes transmitted from the mother and father are identical, having in turn been inherited from a common ancestor at some point in the past. The number and length of ROH reflect individual demographic history, while the homozygosity burden can be used to investigate the genetic architecture of complex disease. Ceballos et al. reviewed how ROH relate to population history and trait architecture in Nature Reviews Genetics.
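The definition above can be illustrated with a minimal sketch that scans one individual's genotypes for contiguous homozygous stretches. This is a toy example, not how PLINK detects ROH: the function name `find_roh`, the 0/1/2 allele-count coding, and the `min_snps` threshold are assumptions made here for illustration. Real tools also allow for occasional heterozygous or missing calls and apply physical length and SNP density limits.

```python
from typing import List, Tuple


def find_roh(genotypes: List[int], min_snps: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of homozygous runs.

    genotypes: allele counts per SNP along one chromosome,
               0 or 2 = homozygous, 1 = heterozygous (assumed coding).
    min_snps:  hypothetical minimum number of consecutive homozygous
               SNPs for a stretch to be reported as a run.
    """
    runs = []
    start = None  # index where the current homozygous stretch began
    for i, g in enumerate(genotypes):
        if g != 1:
            # Homozygous site: open a new stretch if none is open.
            if start is None:
                start = i
        else:
            # Heterozygous site closes the current stretch, if any.
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a stretch that reaches the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


example = [0, 0, 2, 2, 1, 0, 1, 2, 2, 2, 0]
print(find_roh(example, min_snps=3))  # [(0, 3), (7, 10)]
```

The single homozygous site at index 5 is not reported because it is shorter than `min_snps`.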
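Multivariate Genome-Wide Association Analysis
A quantitative trait whose phenotypic value changes over time is called a longitudinal trait (Yang et al., 2006). Examples include milk production traits in dairy cattle and egg production in laying hens. Genetic analysis of longitudinal traits usually follows one of three strategies (Yang et al., 2006):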
Calling variants in RNA-seq data with GATK
Notes on calling variants in RNA-seq data with GATK
- RNA-seq includes reads mapped across splice junctions and has highly variable coverage, so typical variant calling pipelines (for DNA) can produce many false positives and false negatives.
- GATK is currently the gold standard for calling variants in RNA-seq data. See a detailed description of their workflow here:
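Haplotype-Based Methods for Detecting Selection Signatures
The main effect of selection is to change the level of variability within or between species. When a favorable variant arises in a population, positive selection raises the frequency of the favorable allele within a short time. Because of linkage disequilibrium, neutral sites near the favorable allele tend to be selected along with it, a phenomenon known as the hitchhiking effect. As the favorable allele rises in frequency, variation at the linked neutral sites is eliminated or reduced, so their polymorphism also declines. This process is called a selective sweep.
Modes of Selection in Population Genetics
In population genetics, suppose a SNP has two alleles, A and a, and the three genotypes AA, Aa, and aa have fitnesses ωAA, ωAa, and ωaa, respectively. If the fitnesses of the three genotypes are not equal, selection is occurring.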
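To make the role of the three fitnesses concrete, here is a minimal sketch of the standard one-locus, two-allele recursion for the frequency of allele A after one generation of selection. It assumes random mating with Hardy-Weinberg genotype proportions before selection; the function name `next_allele_freq` and the example fitness values are illustrative and not taken from the post.

```python
def next_allele_freq(p: float, w_AA: float, w_Aa: float, w_aa: float) -> float:
    """Frequency of allele A after one generation of selection.

    p:               current frequency of allele A (0 <= p <= 1)
    w_AA, w_Aa, w_aa: fitnesses of genotypes AA, Aa and aa
    Assumes Hardy-Weinberg proportions p^2, 2pq, q^2 before selection.
    """
    q = 1.0 - p
    # Mean fitness of the population.
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # A alleles come from AA (all of them) and Aa (half of them).
    return p * (p * w_AA + q * w_Aa) / mean_w


# Equal fitnesses: no selection, so the frequency does not change.
print(next_allele_freq(0.3, 1.0, 1.0, 1.0))

# Hypothetical directional selection favoring A: the frequency rises.
p = 0.1
for generation in range(5):
    p = next_allele_freq(p, 1.10, 1.05, 1.00)
    print(generation + 1, round(p, 4))
```

When the three fitnesses are equal the recursion returns the same frequency, which matches the statement that unequal fitnesses indicate selection.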
A pipeline for Genome Assembly using SOAPdenovo2
SOAPdenovo2 is a short-read assembly method that can build a de novo draft assembly for human-sized genomes.
Genome Wide Association Analysis based on linear mixed model
A genome-wide association study (GWAS) is a study in which a genome-wide set of genetic variants (usually single nucleotide polymorphisms, SNPs) is genotyped and tested for association with a certain phenotype. In the past decade, numerous genetic loci have been found to be associated with many complex traits (diseases) via GWAS, both in humans (see the National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies) and in animals.
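As a rough illustration of what "tested for association" means for a single SNP, the sketch below regresses a phenotype on allele counts with ordinary least squares and reports the slope and its t-statistic. This is deliberately simpler than a linear mixed model: it has no random polygenic effect and no kinship matrix, so it does not correct for relatedness or population structure. The function name `snp_association` and the toy data are made up for this example.

```python
import math
from typing import Sequence, Tuple


def snp_association(genotypes: Sequence[float],
                    phenotype: Sequence[float]) -> Tuple[float, float]:
    """Simple linear regression of phenotype on allele count (0/1/2).

    Returns (slope, t_statistic). Fixed-effect sketch only; a linear
    mixed model would add a random effect with a kinship covariance.
    """
    n = len(genotypes)
    if n != len(phenotype) or n < 3:
        raise ValueError("need matching inputs with at least 3 individuals")
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(genotypes, phenotype))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(genotypes, phenotype))
    sigma2 = rss / (n - 2)
    if sigma2 == 0:
        return slope, math.inf  # perfect fit
    return slope, slope / math.sqrt(sigma2 / sxx)


# Toy data: six individuals, phenotype increases with allele count.
geno = [0, 1, 2, 0, 1, 2]
pheno = [1.0, 2.1, 2.9, 1.1, 1.9, 3.0]
slope, t_stat = snp_association(geno, pheno)
print(round(slope, 3), round(t_stat, 1))
```

In a real GWAS this test is repeated for every SNP, and the resulting p-values are corrected for multiple testing.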