Introduction to the Functions of the R Biostrings Package (Part 1)

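This article introduces the main features of the R Biostrings package: its predefined constants, the single-sequence classes, the XStringViews class, the sequence-set classes, and basic sequence conversions such as DNA-to-RNA transcription and complementation. Examples show how to create and manipulate BString, DNAString, and related objects, and how to align and convert sequences.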

I. Predefined constants

> DNA_BASES 
[1] "A" "C" "G" "T"
> RNA_BASES 
[1] "A" "C" "G" "U"
> GENETIC_CODE 
TTT TTC TTA TTG TCT TCC TCA TCG TAT TAC TAA TAG TGT TGC TGA TGG CTT 
"F" "F" "L" "L" "S" "S" "S" "S" "Y" "Y" "*" "*" "C" "C" "*" "W" "L" 
CTC CTA CTG CCT CCC CCA CCG CAT CAC CAA CAG CGT CGC CGA CGG ATT ATC 
"L" "L" "L" "P" "P" "P" "P" "H" "H" "Q" "Q" "R" "R" "R" "R" "I" "I" 
ATA ATG ACT ACC ACA ACG AAT AAC AAA AAG AGT AGC AGA AGG GTT GTC GTA 
"I" "M" "T" "T" "T" "T" "N" "N" "K" "K" "S" "S" "R" "R" "V" "V" "V" 
GTG GCT GCC GCA GCG GAT GAC GAA GAG GGT GGC GGA GGG 
"V" "A" "A" "A" "A" "D" "D" "E" "E" "G" "G" "G" "G" 
attr(,"alt_init_codons")
[1] "TTG" "CTG"
> RNA_GENETIC_CODE
UUU UUC UUA UUG UCU UCC UCA UCG UAU UAC UAA UAG UGU UGC UGA UGG CUU 
"F" "F" "L" "L" "S" "S" "S" "S" "Y" "Y" "*" "*" "C" "C" "*" "W" "L" 
CUC CUA CUG CCU CCC CCA CCG CAU CAC CAA CAG CGU CGC CGA CGG AUU AUC 
"L" "L" "L" "P" "P" "P" "P" "H" "H" "Q" "Q" "R" "R" "R" "R" "I" "I" 
AUA AUG ACU ACC ACA ACG AAU AAC AAA AAG AGU AGC AGA AGG GUU GUC GUA 
"I" "M" "T" "T" "T" "T" "N" "N" "K" "K" "S" "S" "R" "R" "V" "V" "V" 
GUG GCU GCC GCA GCG GAU GAC GAA GAG GGU GGC GGA GGG 
"V" "A" "A" "A" "A" "D" "D" "E" "E" "G" "G" "G" "G" 
attr(,"alt_init_codons")
[1] "UUG" "CUG"
> AMINO_ACID_CODE
    A     R     N     D     C     Q     E     G     H     I     L 
"Ala" "Arg" "Asn" "Asp" "Cys" "Gln" "Glu" "Gly" "His" "Ile" "Leu" 
    K     M     F     P     S     T     W     Y     V     U     O 
"Lys" "Met" "Phe" "Pro" "Ser" "Thr" "Trp" "Tyr" "Val" "Sec" "Pyl" 
    B     J     Z     X 
"Asx" "Xle" "Glx" "Xaa" 
> IUPAC_CODE_MAP
     A      C      G      T      M      R      W      S      Y      K 
   "A"    "C"    "G"    "T"   "AC"   "AG"   "AT"   "CG"   "CT"   "GT" 
     V      H      D      B      N 
 "ACG"  "ACT"  "AGT"  "CGT" "ACGT" 
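
Because these constants are ordinary named character vectors, they can be indexed by name. The following is a minimal sketch that uses only the objects printed above; the variable names are illustrative and not part of the package.

```r
library(Biostrings)

# Look up the amino acid (one-letter code) encoded by a DNA codon.
start_codon_aa <- GENETIC_CODE[["ATG"]]

# Convert the one-letter code to its three-letter abbreviation.
start_codon_aa3 <- AMINO_ACID_CODE[[start_codon_aa]]

# Expand an IUPAC ambiguity code into the bases it stands for.
purines <- strsplit(IUPAC_CODE_MAP[["R"]], "")[[1]]

# Collect every codon that encodes a stop ("*").
stop_codons <- names(GENETIC_CODE)[GENETIC_CODE == "*"]
```

Indexing with `[[` returns the bare value without its name, which is convenient when the result is passed on to another lookup.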

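Substitution matrices for nucleic acid and protein sequence alignment are also provided. They are not loaded automatically, so the user must load or build them as needed. The last three entries below are functions: nucleotideSubstitutionMatrix() and qualitySubstitutionMatrices() can be called without arguments, whereas errorSubstitutionMatrices() requires an errorProbability value.

  1. data(BLOSUM45), data(BLOSUM50), data(BLOSUM62), data(BLOSUM80), data(BLOSUM100)
  2. data(PAM30), data(PAM40), data(PAM70), data(PAM120), data(PAM250)
  3. nucleotideSubstitutionMatrix()
  4. qualitySubstitutionMatrices()
  5. errorSubstitutionMatrices()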
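
As a rough sketch of loading and building such matrices: the example below loads BLOSUM62 and constructs a simple match/mismatch matrix for DNA. One assumption to flag: in recent Bioconductor releases these objects appear to have moved from Biostrings into the separate pwalign package, so the sketch picks whichever package is installed. The names `pkg`, `score_AA`, `make_nuc_matrix`, and `nuc` are illustrative.

```r
library(Biostrings)

# Assumption: newer Bioconductor releases ship these objects in pwalign,
# older ones in Biostrings itself; use whichever is available.
pkg <- if (requireNamespace("pwalign", quietly = TRUE)) "pwalign" else "Biostrings"

# Load a protein substitution matrix into the current environment.
data(BLOSUM62, package = pkg)
score_AA <- BLOSUM62["A", "A"]   # score for aligning Ala with Ala

# Build a simple DNA matrix: +1 for a match, -1 for a mismatch,
# restricted to the four unambiguous bases.
make_nuc_matrix <- getExportedValue(pkg, "nucleotideSubstitutionMatrix")
nuc <- make_nuc_matrix(match = 1, mismatch = -1, baseOnly = TRUE)
```

With an older Biostrings installation, plain `data(BLOSUM62)` and a direct call to `nucleotideSubstitutionMatrix()` should be enough.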

II. Classes that hold a single sequence

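  1. XString is a virtual class from which the BString, DNAString, RNAString, and AAString classes are derived.
  2. The BString class is a general-purpose container for storing a big string (a long sequence of characters) and for making its manipulation easy and efficient. DNAString, RNAString, and AAString are similar containers specialized for DNA sequences (DNAString), RNA sequences (RNAString), and amino acid sequences (AAString).
  3. An XString object differs from a standard character vector in two main ways: (1) the data stored in an XString object are not copied when the object is copied, and (2) an XString object can hold only one string (see the XStringSet container for an efficient way to store a large collection of strings in a single object).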
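
The class relationships described in this list can be checked with a short sketch. The sequences and variable names below are made up for illustration.

```r
library(Biostrings)

b <- BString("any text, e.g. >>> 123")  # generic container, any characters
d <- DNAString("ACGTTGCA")              # DNA alphabet only
r <- RNAString("ACGUUGCA")              # RNA alphabet only
a <- AAString("MKTAYIAK")               # amino acid alphabet

# All four classes derive from the virtual class XString.
all_xstring <- all(vapply(list(b, d, r, a), is, logical(1), "XString"))

# One object holds exactly one sequence; length() is its number of letters.
n_letters <- length(d)

# Convert back to an ordinary character string when needed.
d_chr <- as.character(d)
```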

Examples (in what follows, each object x is an XString):

  1. BString(x="", start=1, nchar=NA): attempts to convert x into a BString object by reading from the start position of x ...
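
A minimal sketch of the constructor, assuming (from the signature shown) that start is the 1-based position of the first letter to read and nchar is the number of letters to read. The input string and variable names are illustrative.

```r
library(Biostrings)

txt <- "Hello world"

whole <- BString(txt)                        # defaults: the entire string
part  <- BString(txt, start = 7, nchar = 5)  # 5 letters starting at position 7

as.character(part)
```

Leaving nchar at its default of NA reads through to the end of x.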