Linux Basics
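SSH carries the commands you type to the server.
SFTP transfers files to the server.
A server is essentially a remote computer, and most servers run Linux. Processing large datasets requires a well-provisioned server; in bioinformatics, for example, the upstream processing of NGS omics sequencing data is done on a server. A server is normally accessed remotely through the command line rather than through a desktop interface. One advantage of a Linux server is that many users can be logged in at the same time.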
Practical details
· Termius settings: select to copy, right-click to paste
· Type exit to leave Termius (shortcut: Control+D); press the Up arrow key to log in again
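Symbols used in directory management and paths
· . the current directory
· .. the parent directory
· ~ the home directory; every user has a different one
· / the root directory only when it is the first character of a path; anywhere else, / separates directory levels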
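A minimal sketch of how these symbols resolve, run inside a throwaway directory created with mktemp. The directory names project and raw are made up for illustration and are not part of the course data.

```shell
# Throwaway sandbox; "project" and "raw" are hypothetical directory names.
base=$(mktemp -d)
mkdir -p "$base/project/raw"
cd "$base/project/raw"

here=$(cd . && pwd)      # .  resolves to the current directory
parent=$(cd .. && pwd)   # .. resolves to the directory one level up
root=$(cd / && pwd)      # a leading / is the root directory

echo "current: $here"
echo "parent:  $parent"
echo "root:    $root"
echo "home:    $HOME"    # ~ normally expands to the same path as $HOME
```

Each `cd` runs inside `$( )`, that is, in a subshell, so the working directory of the session itself does not change.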
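Linux command format: command + options + file
command -option parameter, with the parts separated by spaces
· command: the command name, a word or abbreviation for what it does
· []: marks a part that can sometimes be omitted
· -option: an option that controls how the command behaves; it can be omitted
Options come in two forms, short and long: -h, --help
· parameter: the arguments passed to the command; there can be zero, one, or several
· FILE: the file to process #see Figure 1 for an example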
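As a rough illustration of the command, option, and FILE parts, the sketch below calls head with a short option and then with its long spelling. The long form --lines is a GNU coreutils spelling and is an assumption about the system; the fallback keeps the sketch running where it is missing.

```shell
# Sample input in a temporary file (the path comes from mktemp).
f=$(mktemp)
printf 'line1\nline2\nline3\n' > "$f"

# command = head, option = -n, option value = 2, FILE = "$f"
short=$(head -n 2 "$f")

# Long form of the same option; --lines is GNU-specific, so fall back
# to the short form if this head does not know it.
long=$(head --lines=2 "$f" 2>/dev/null || head -n 2 "$f")

echo "$short"
```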
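File and directory management commands
· pwd ls cd mkdir touch mv rm cp tar ln #see Figure 2
pwd: see Figure 3
ls: see Figures 4 and 5; ll -thr shows detailed information and is used often (ll is a shell alias for ls -l on many systems)
* matches any number of characters (zero or more)
? matches exactly one character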
Mar402 13:10:27 ~
$ ls *.txt
readme.txt
Mar402 13:10:50 ~
$ ls ??????.txt
readme.txt
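The two wildcards can be tried safely in a scratch directory. This is a small sketch; the file names other than readme.txt are invented for the example.

```shell
dir=$(mktemp -d)
cd "$dir"
touch readme.txt notes.txt a.txt data.csv   # hypothetical file names

set -- *.txt          # * matches any number of characters, including none
star_count=$#
set -- ?.txt          # each ? matches exactly one character
one_char=$*
set -- ??????.txt     # exactly six characters followed by .txt
six_char=$*

echo "$star_count | $one_char | $six_char"
```

The shell expands the pattern before the command runs, so ls (or here, set --) only ever sees the matching file names.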
File attributes: ll -h (Figure 6)
File permissions (Figure 7)
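Since the permission figure is not reproduced here, the following sketch shows how the first ten characters of an ls -l line encode permissions. The chmod command is not covered in these notes; it is used here only as an assumption-free way to set known permissions on a temporary file.

```shell
f=$(mktemp)
chmod 640 "$f"   # 6 = rw- (owner), 4 = r-- (group), 0 = --- (others)

# First 10 characters of ls -l: file type, then three rwx triplets.
perm=$(ls -l "$f" | cut -c 1-10)
echo "$perm"

chmod u+x "$f"   # symbolic form: add execute permission for the owner
perm_x=$(ls -l "$f" | cut -c 1-10)
echo "$perm_x"
```

The leading - marks a regular file; a directory would show d and a soft link l, as in the ll -thr listing later in these notes.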
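cd: see Figure 8
Absolute and relative paths
· Absolute path: the full path, starting from the root directory
· Relative path: a path given relative to the current working directory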
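A short sketch of the difference, assuming a scratch directory that mimics the Data/readme.txt layout used in these notes (the file content "hello" is made up):

```shell
base=$(mktemp -d)
mkdir -p "$base/Data"
echo "hello" > "$base/Data/readme.txt"

cd "$base"
rel=$(cat Data/readme.txt)           # relative: resolved from the current directory
abs=$(cat "$base/Data/readme.txt")   # absolute: starts at /, works from anywhere

cd "$base/Data"
rel_up=$(cat ../Data/readme.txt)     # .. climbs one level, then descends again

echo "$rel $abs $rel_up"
```

The relative path only works while the working directory is the one it was written for; the absolute path keeps working after any cd.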
Creating directories: mkdir (Figure 9)
Mar402 13:24:33 /home/t_linux
$ cd ~
Mar402 13:26:18 ~
$ ls
Data Data.tar.gz readme.txt
Mar402 13:26:19 ~
$ mkdir test #create a directory named test
Mar402 13:27:13 ~
$ ls
Data Data.tar.gz readme.txt test
Mar402 13:27:28 ~
$ mkdir -p test1/test2 #create test1 with test2 inside it; -p creates missing parent directories
Mar402 13:28:05 ~
$ tree #display the result with tree
.
├── Data
│ ├── example.fa
│ ├── example.fq
│ ├── example_gene.gtf
│ ├── example.gtf
│ ├── Homo_sapiens.GRCh38.102.chromosome.Y.gff3.gz
│ ├── md5.txt
│ ├── readme.txt
│ ├── reads.1.fq.gz
│ └── reads.2.fq.gz
├── Data.tar.gz
├── readme.txt
├── test
└── test1
    └── test2
4 directories, 11 files
Creating a new file: touch
Mar402 13:28:07 ~
$ touch file
Mar402 13:32:40 ~
$ tree
.
├── Data
│ ├── example.fa
│ ├── example.fq
│ ├── example_gene.gtf
│ ├── example.gtf
│ ├── Homo_sapiens.GRCh38.102.chromosome.Y.gff3.gz
│ ├── md5.txt
│ ├── readme.txt
│ ├── reads.1.fq.gz
│ └── reads.2.fq.gz
├── Data.tar.gz
├── file
├── readme.txt
├── test
└── test1
    └── test2
4 directories, 12 files
Moving and renaming files: mv (Figures 10 and 11)
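The figures for mv are not reproduced here, so the following is a minimal sketch of the three common cases. All file and directory names are hypothetical.

```shell
dir=$(mktemp -d)
cd "$dir"
mkdir archive
touch draft.txt

mv draft.txt final.txt              # same directory, new name: rename
mv final.txt archive/               # target is an existing directory: move into it
mv archive/final.txt ./report.txt   # move and rename in one step

ls -R
```

mv silently overwrites an existing target file; mv -i asks first, in the same way as rm -i.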
Deleting files: rm (Figure 12)
Mar402 14:15:15 ~
$ ls
Data Data.tar.gz readme.txt test test1 test2
Mar402 14:24:29 ~
$ rm Data.tar.gz #plain rm deletes a file (an archive is just a file)
Mar402 14:24:41 ~
$ ls
Data readme.txt test test1 test2
Mar402 14:24:43 ~
$ rm -r test1 #add -r to delete a directory
Mar402 14:25:09 ~
$ ls
Data readme.txt test test2
Mar402 14:25:10 ~
$ rm -r test2
Mar402 14:26:04 ~
$ rm -i -r test #-i descends into test and asks for confirmation before each deletion
rm: remove directory 'test'? y
Mar402 14:26:48 ~
$ ls
Data readme.txt
Copying files: cp
Mar402 14:31:58 ~
$ ls
Data readme.txt test1
Mar402 14:31:59 ~
$ cp readme.txt test1 #copy readme.txt into test1
Mar402 14:32:14 ~
$ ls test1/ #list test1: it now contains readme.txt
readme.txt
Mar402 14:32:19 ~
$ cp readme.txt test1/read #copy readme.txt into test1 under the new name read
Mar402 14:32:47 ~
$ ls test1/
read readme.txt #test1 now contains read, whose content is identical to readme.txt
Mar402 14:33:17 ~
$ tree
.
├── Data
│ ├── example.fa
│ ├── example.fq
│ ├── example_gene.gtf
│ ├── example.gtf
│ ├── Homo_sapiens.GRCh38.102.chromosome.Y.gff3.gz
│ ├── md5.txt
│ ├── readme.txt
│ ├── reads.1.fq.gz
│ └── reads.2.fq.gz
├── readme.txt
└── test1
    ├── read
    └── readme.txt
2 directories, 12 files
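ln (link): creates links, either soft (symbolic) links with ln -s, which are the commonly used kind, or hard links, which are the default without -s; see Figure 13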
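A small sketch of how the two link types differ once the original file is deleted. File names are made up for the example.

```shell
dir=$(mktemp -d)
cd "$dir"
echo "original" > target.txt

ln -s target.txt soft.txt   # soft (symbolic) link: a pointer to the path
ln target.txt hard.txt      # hard link (default): a second name for the same data

rm target.txt
hard_content=$(cat hard.txt)   # still readable: the data has another name
if [ -e soft.txt ]; then soft_state=ok; else soft_state=dangling; fi

echo "$hard_content / $soft_state"
```

A soft link can point at a directory or at a file on another disk, which is why it is the usual choice for sharing large data such as the samtools-1.14.tar.bz2 link shown in the ll -thr listing later in these notes.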
Compressing and extracting files: tar (Figure 14)
tar -zxvf FILE: extract
tar -zcvf ARCHIVE FILE...: compress
Mar402 15:11:46 ~/mydir
$ ls
Data.tar.gz
Mar402 15:11:49 ~/mydir
$ tar -zxvf Data.tar.gz #extract
Data/
Data/reads.1.fq.gz
Data/example_gene.gtf
Data/example.fq
Data/example.fa
Data/reads.2.fq.gz
Data/example.gtf
Data/readme.txt
Data/md5.txt
Data/Homo_sapiens.GRCh38.102.chromosome.Y.gff3.gz
Mar402 15:11:54 ~/mydir
$ tar -zcvf test.tar.gz #compress
tar: Cowardly refusing to create an empty archive #no files were named after the archive, so tar has nothing to pack
Try 'tar --help' or 'tar --usage' for more information.
Mar402 15:16:04 ~/mydir
$ cd ../ #go back up to the parent directory
Mar402 15:16:10 ~
$ pwd
/trainee/Mar402
Mar402 15:16:14 ~
$ ls
Data mydir readme.txt samtools-1.14.tar.bz2 test1 #the files in this directory
Mar402 15:17:00 ~
$ tar -zcvf test.tar.gz #pressing Tab lists the candidates below; choose the files to put in the archive
.bash_history Data/ .profile test1/
.bashrc .gnupg/ readme.txt
.cache/ mydir/ samtools-1.14.tar.bz2
$ tar -zcvf test.tar.gz test1/ samtools-1.14.tar.bz2 readme.txt
test1/
test1/readme.txt
test1/read
samtools-1.14.tar.bz2
readme.txt
Mar402 15:20:09 ~
$ ls #the archive has been created
Data mydir readme.txt samtools-1.14.tar.bz2 test1 test.tar.gz
Mar402 15:20:48 ~
$ ll -thr #show detailed information for this directory
total 56K
drwxrwxr-x 2 Mar402 Mar402 4.0K Oct 25 2021 Data/
-rw-r--r-- 1 Mar402 root 207 Mar 20 14:31 readme.txt
-rw-r--r-- 1 Mar402 root 807 Mar 20 14:31 .profile
drwx------ 3 Mar402 Mar402 4.0K Mar 20 19:51 .gnupg/
drwx------ 2 Mar402 Mar402 4.0K Mar 20 19:51 .cache/
-rw-r--r-- 1 Mar402 root 3.3K Mar 20 20:53 .bashrc
drwxrwxr-x 2 Mar402 Mar402 4.0K Mar 25 14:32 test1/
lrwxrwxrwx 1 Mar402 Mar402 35 Mar 25 14:50 samtools-1.14.tar.bz2 -> /home/t_linux/samtools-1.14.tar.bz2
drwxrwxr-x 3 Mar402 Mar402 4.0K Mar 25 15:11 mydir/
drwxr-xr-x 7 Mar402 trainee 4.0K Mar 25 15:19 ./
-rw-rw-r-- 1 Mar402 Mar402 430 Mar 25 15:20 test.tar.gz
drwxr-xr-x 173 root root 12K Mar 25 15:20 ../
-rw------- 1 Mar402 Mar402 3.8K Mar 25 15:20 .bash_history
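Other compression and decompression commands (Figure 15)
zip and unzip: compress and decompress .zip files
gzip and gunzip: compress and decompress .gz files
bzip2 and bunzip2: compress and decompress .bz2 files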
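A quick round trip with gzip and gunzip, as a sketch on a made-up two-line file. bzip2 and bunzip2 behave the same way with the .bz2 suffix; they are left out only because they are not installed everywhere.

```shell
dir=$(mktemp -d)
cd "$dir"
printf 'ACGT\nTTGA\n' > reads.txt   # hypothetical sample file

gzip reads.txt                      # replaces reads.txt with reads.txt.gz
after_gzip=$(ls)

# -dc decompresses to standard output and leaves the .gz file untouched,
# which is what zcat does.
peek=$(gzip -dc reads.txt.gz | head -n 1)

gunzip reads.txt.gz                 # restores reads.txt and removes the .gz
after_gunzip=$(ls)

echo "$after_gzip -> $after_gunzip (first line: $peek)"
```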
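Archiving vs. compressing: archive first, then compress
General-purpose extraction command: tar -xf FILE
Archiving (tar): bundling many files or directories into a single file
Compressing: shrinking a large file into a smaller one with a compression algorithm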
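The two steps can be made visible by running them separately. This is a sketch under the assumption that tar is GNU tar or bsdtar, both of which detect the compression format on extraction; the Data directory here is a two-file stand-in, not the course data.

```shell
dir=$(mktemp -d)
cd "$dir"
mkdir Data
echo "a" > Data/one.txt
echo "b" > Data/two.txt

tar -cf Data.tar Data/          # step 1, archive: many files become one .tar
gzip Data.tar                   # step 2, compress: Data.tar becomes Data.tar.gz
tar -zcf OneStep.tar.gz Data/   # -z performs both steps in one command

rm -r Data
tar -xf Data.tar.gz             # assumes a tar that auto-detects gzip
restored=$(cat Data/one.txt Data/two.txt | tr -d '\n')
echo "$restored"
```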
Command summary: Figure 16
Common Linux keyboard shortcuts: Figure 17
Linux command manual: http://linux.51yip.com/
Linux book: https://wizardforcel.gitbooks.io/vbird-linux-basic-4e/content/
Viewing files: cat
Mar402 10:52:03 ~
$ ls
Da Data Miniconda3-latest-Linux-x86_64.sh mydir readme.txt
Mar402 11:04:22 ~
$ cat readme.txt #print the whole file; mind the file size before doing this
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 11:04:30 ~
$ cat -A readme.txt #print everything including invisible characters; the end of every line is marked with $
Welcome to Biotrainee() !$
This is your personal account in our Cloud.$
Have a fun with it.$
Please feel free to contact with me( email to jmzeng1314@163.com )$
(http://www.biotrainee.com/thread-1376-1-1.html)$
$
Mar402 11:08:56 ~
$ cat -n readme.txt #number every line
1 Welcome to Biotrainee() !
2 This is your personal account in our Cloud.
3 Have a fun with it.
4 Please feel free to contact with me( email to jmzeng1314@163.com )
5 (http://www.biotrainee.com/thread-1376-1-1.html)
6
Mar402 11:09:08 ~
$ cat -b readme.txt #number the lines, skipping empty ones
1 Welcome to Biotrainee() !
2 This is your personal account in our Cloud.
3 Have a fun with it.
4 Please feel free to contact with me( email to jmzeng1314@163.com )
5 (http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 11:09:16 ~
$ ls
Da Data Miniconda3-latest-Linux-x86_64.sh mydir readme.txt
Mar402 11:11:27 ~
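$ cat > file #type content into file; > is redirection: whatever a command prints to the screen goes to its standard output stream, and > sends that stream into a file instead. Lines already entered cannot be edited afterwards; input is normally ended with Ctrl+D (Ctrl+C, used here, also stops cat)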
1
2
3
^C
Mar402 11:11:38 ~
$ cat file #view the content of file
1
2
3
Mar402 11:11:45 ~
$ ls
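Da Data file Miniconda3-latest-Linux-x86_64.sh mydir readme.txt #file now exists in the current directory
zcat: view a gzip-compressed text file without unpacking it; tac: print a file with its lines in reverse order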
Mar402 12:13:45 ~
$ zcat Data/reads.1.fq.gz
head/tail -n: show the first/last n lines of a file (default: 10 lines)
Mar402 12:22:15 ~
$ head Data/example.fa
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA
TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC
ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG
CCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC
AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTG
AAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT
GACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCGCAATTGAAAACTTTCGTCGATCAGGAATTT
Mar402 12:22:46 ~
$ head -2 Data/example.fa #show the first two lines
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
Mar402 12:23:02 ~
$ head -1 Data/example.fa #show the first line
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
Mar402 12:23:08 ~
$ head -1 Data/example.fa | rev # reverse the characters of that line
emoneg etelpmoc ,5561GM .rtsbus 21-K .rts iloc aihcirehcsE |3.319000_CN|fer|438305655|ig>
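The transcript above only exercises head. A brief sketch for tail, and for combining the two to cut out a range of lines, using seq as stand-in input:

```shell
# Last three lines; paste -s puts them on one row for display.
last3=$(seq 20 | tail -n 3 | paste -s -d ' ' -)

# Lines 11 to 15: keep the first 15, then the last 5 of those.
middle=$(seq 20 | head -n 15 | tail -n 5 | paste -s -d ' ' -)

echo "$last3"
echo "$middle"
```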
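# more: view a file page by page; Space = next page, Enter = next line
$ more Data/example.fq
# less: view a file; see Figure 18
Typical usage is less -NS (-N shows line numbers, -S keeps each long line on a single row instead of wrapping it). Inside less, type / followed by a keyword to search (n jumps to the next match, N to the previous one, G to the end of the file, gg to the beginning); see Figure 19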
Text statistics: wc
· wc -l counts lines
· wc -w counts words
· wc -c counts bytes
$ wc -l Data/example.fq Data/example.fa Data/example.gtf #wc -l accepts several files and counts the lines of each
4000 Data/example.fq
64995 Data/example.fa
237 Data/example.gtf
69232 total
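The other two wc options can be checked on a tiny input whose counts are easy to verify by hand. This is a sketch; the sample text is invented.

```shell
f=$(mktemp)
printf 'one two\nthree\n' > "$f"

# $(( ... )) strips the padding that some wc implementations add.
lines=$(( $(wc -l < "$f") ))
words=$(( $(wc -w < "$f") ))
bytes=$(( $(wc -c < "$f") ))

echo "$lines lines, $words words, $bytes bytes"
```

Reading from standard input (wc -l < file) prints the number alone, without the file name, which makes the result easier to reuse in a script.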
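Splitting text into fields: cut
cut -d sets the field delimiter (default: tab, \t); any single character, whether a letter, digit, or symbol, can be used #Figure 22
cut -f selects which fields (columns) to output #see Figures 20 and 21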
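Because the cut figures are missing, here is a minimal sketch on one hand-written, GTF-like row. The values are placeholders chosen for the example, not taken from example.gtf.

```shell
# One tab-separated row; the field values are illustrative only.
row=$(printf 'chr1\tHAVANA\texon\t11869\t12227')

cols=$(printf '%s\n' "$row" | cut -f 1,3)           # fields 1 and 3; tab is the default delimiter
range=$(printf '%s\n' "$row" | cut -f 4-5)          # fields 4 through 5
second=$(printf '%s\n' 'a:b:c' | cut -d ':' -f 2)   # custom delimiter

printf '%s\n' "$cols" "$range" "$second"
```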
Sorting: sort (see Figures 23 and 24)
$ cat Data/example.gtf | sort -k 4 -n | column -t | less -SN #compare the resulting order with Figure 24; see Figure 25
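Why -n matters can be seen on three numbers. A small sketch with made-up input:

```shell
# Without -n, lines are compared as text, character by character.
lexical=$(printf '10\n9\n100\n' | sort | paste -s -d ' ' -)

# With -n, lines are compared by numeric value.
numeric=$(printf '10\n9\n100\n' | sort -n | paste -s -d ' ' -)

# -k 2 sorts on the second field; -n compares it as a number.
by_col=$(printf 'b 20\na 3\n' | sort -k 2 -n | paste -s -d ',' -)

echo "$lexical"
echo "$numeric"
echo "$by_col"
```

Genomic coordinates such as column 4 of a GTF file are numbers, so sorting them without -n would place 100 before 9.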
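uniq: removes duplicate lines, but only adjacent ones, so it is normally combined with sort
uniq -c: prefixes each line with the number of consecutive times it occurred (Figure 26)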
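The "adjacent only" rule is easiest to see on input where the duplicates are separated. A sketch with three invented lines:

```shell
# The two "exon" lines are not adjacent, so uniq alone keeps all three lines.
unsorted=$(( $(printf 'exon\ngene\nexon\n' | uniq | wc -l) ))

# After sort the duplicates sit next to each other and collapse into one.
sorted=$(( $(printf 'exon\ngene\nexon\n' | sort | uniq | wc -l) ))

# uniq -c adds the count; tr -s squeezes the padding in front of it.
counts=$(printf 'exon\ngene\nexon\n' | sort | uniq -c | tr -s ' ' | paste -s -d ';' -)

echo "$unsorted $sorted"
echo "$counts"
```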
paste: merging text
· paste -d sets the delimiter used when merging
· paste -s merges serially: all lines of each file are joined into one row
Mar402 15:53:58 ~
$ cat file #the existing file named file
1
2
3
sjdiaf
Mar402 15:54:03 ~
$ cat > file2 #create a new file, file2
asdfg
edcvfr
ikmnju
^C
Mar402 15:54:29 ~
$ cat file file2 #concatenate file and file2 vertically
1
2
3
sjdiaf
asdfg
edcvfr
ikmnju
Mar402 15:54:54 ~
$ cat file file2 > file3 #save the concatenation of file and file2 as file3
Mar402 15:55:07 ~
$ cat file3
1
2
3
sjdiaf
asdfg
edcvfr
ikmnju
Mar402 15:55:20 ~
$ paste file file2 #paste merges file and file2 side by side
1 asdfg
2 edcvfr
3 ikmnju
sjdiaf
Mar402 16:51:31 ~
$ paste -s file file2 #paste -s turns each file's column of lines into a single row
1 2 3 sjdiaf
asdfg edcvfr ikmnju
# paste, second use
Mar402 15:55:54 ~
$ seq 20 #seq prints the numbers from 1 to 20
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Mar402 16:00:51 ~
$ seq 20 | paste - - - - #each - reads one line from standard input, so four of them arrange the numbers into a 4-column matrix
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
17 18 19 20
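tr: character substitution
· tr '<pre>' '<dest>': replace each character in <pre> with the corresponding character in <dest>
· tr -d: delete the given characters
· tr -s: squeeze runs of a repeated character into one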
Mar402 16:01:03 ~
$ cat readme.txt | tr '[a-z]' '[A-Z]' #replace lowercase a-z with uppercase
WELCOME TO BIOTRAINEE() !
THIS IS YOUR PERSONAL ACCOUNT IN OUR CLOUD.
HAVE A FUN WITH IT.
PLEASE FEEL FREE TO CONTACT WITH ME( EMAIL TO JMZENG1314@163.COM )
(HTTP://WWW.BIOTRAINEE.COM/THREAD-1376-1-1.HTML)
Mar402 16:20:59 ~
$ cat readme.txt | tr 'a' 'A' #replace a with A
Welcome to BiotrAinee() !
This is your personAl Account in our Cloud.
HAve A fun with it.
PleAse feel free to contAct with me( emAil to jmzeng1314@163.com )
(http://www.biotrAinee.com/threAd-1376-1-1.html)
Mar402 16:26:35 ~
$ cat readme.txt | tr ' ' '\t' #replace every space with a tab (\t)
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 16:27:22 ~
$ cat readme.txt | tr ' ' '\t' | cat -A #cat -A makes the tabs visible as ^I
Welcome^Ito^IBiotrainee()^I!$
This^Iis^Iyour^Ipersonal^Iaccount^Iin^Iour^ICloud.$
Have^Ia^Ifun^Iwith^Iit.$
Please^Ifeel^Ifree^Ito^Icontact^Iwith^Ime(^Iemail^Ito^Ijmzeng1314@163.com^I)$
(http://www.biotrainee.com/thread-1376-1-1.html)$
$
Mar402 16:27:28 ~
$ cat readme.txt | cat -A
Welcome to Biotrainee() !$
This is your personal account in our Cloud.$
Have a fun with it.$
Please feel free to contact with me( email to jmzeng1314@163.com )$
(http://www.biotrainee.com/thread-1376-1-1.html)$
$
Mar402 16:28:03 ~
$ cat readme.txt | tr -d ' ' #delete the spaces
WelcometoBiotrainee()!
ThisisyourpersonalaccountinourCloud.
Haveafunwithit.
Pleasefeelfreetocontactwithme(emailtojmzeng1314@163.com)
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 16:29:04 ~
$ cat readme.txt | tr -d '\n'
Welcome to Biotrainee() !This is your personal account in our Cloud.Have a fun with it.Please feel free to contact with me( email to jmzeng1314@163.com )(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 16:29:23 ~
$ cat Data/example.gtf | head -20 | cut -f 3 | sort | uniq -c #uniq -c pads every count with leading spaces
1 CDS
10 exon
1 gene
1 start_codon
2 transcript
5 UTR
Mar402 16:38:41 ~
$ cat Data/example.gtf | head -20 | cut -f 3 | sort | uniq -c | cut -d ' ' -f 6 #take field 6 with a space delimiter: the number of leading spaces differs between rows, so the fields do not line up
10
Mar402 16:39:04 ~
$ cat Data/example.gtf | head -20 | cut -f 3 | sort | uniq -c | cut -d ' ' -f 7
1
exon
1
1
2
5
Mar402 16:39:11 ~
$ cat Data/example.gtf | head -20 | cut -f 3 | sort | uniq -c |tr -s ' ' #tr -s squeezes each run of repeated spaces into a single space
1 CDS
10 exon
1 gene
1 start_codon
2 transcript
5 UTR
Mar402 16:39:39 ~
$ cat Data/example.gtf | cut -f 3 | sort | uniq -c | tr -s ' ' #next goal: add up the counts, which sit in field 2 (field 1 is empty because every line starts with a space)
29 CDS
111 exon
20 gene
7 start_codon
9 stop_codon
34 transcript
27 UTR
Mar402 16:49:04 ~
$ cat Data/example.gtf | cut -f 3 | sort | uniq -c | tr -s ' ' | cut -d ' ' -f 2 #extract field 2 using a space as the delimiter
29
111
20
7
9
34
27
Mar402 16:50:33 ~
$ cat Data/example.gtf | cut -f 3 | sort | uniq -c | tr -s ' ' | cut -d ' ' -f 2 | paste -s #paste -s joins the lines into one row
29 111 20 7 9 34 27
Mar402 16:53:37 ~
$ cat Data/example.gtf | cut -f 3 | sort | uniq -c | tr -s ' ' | cut -d ' ' -f 2 | paste -s -d ':' #paste -s -d ':' joins them with a colon
29:111:20:7:9:34:27
Mar402 16:53:53 ~
$ cat Data/example.gtf | cut -f 3 | sort | uniq -c | tr -s ' ' | cut -d ' ' -f 2 | paste -s -d '+'
29+111+20+7+9+34+27
Mar402 16:53:58 ~
$ cat Data/example.gtf | cut -f 3 | sort | uniq -c | tr -s ' ' | cut -d ' ' -f 2 | paste -s -d '+' | bc #bc evaluates the expression and prints the final sum
237
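The total can be cross-checked without bc, which is not installed on every system. The sketch below reuses the seven counts printed above and sums them in two other ways: with awk and with shell arithmetic. Neither is part of the original exercise.

```shell
# The seven counts from the uniq -c output above.
counts='29 111 20 7 9 34 27'

# awk adds the first field of every line and prints the total at the end.
# $counts is deliberately unquoted so that each number becomes one line.
awk_total=$(printf '%s\n' $counts | awk '{ s += $1 } END { print s }')

# Shell arithmetic on the same string that paste -s -d '+' builds.
expr_str=$(printf '%s\n' $counts | paste -s -d '+' -)
shell_total=$(( $expr_str ))

echo "$expr_str = $shell_total (awk: $awk_total)"
```

Both totals equal the number of lines in example.gtf reported by wc -l earlier, as expected, since every line has exactly one feature type in column 3.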
Viewing and editing file contents: summary (Figure 27)
Exercise 7
Mar402 12:52:01 ~
$ less Data/example.gtf | wc
237 6944 77781
Mar402 12:52:17 ~
$ cat Data/example.gtf | cut -f 9 | head
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
gene_id "ENSG00000223972"; transcript_id "ENSG00000223972"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1"; level 2; havana_gene "OTTHUMG00000000961";
gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
Mar402 12:52:24 ~
$ cat Data/example.gtf | cut -f 9 | cut -d ';' -f 1 | head
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
gene_id "ENSG00000223972"
Mar402 12:53:26 ~
$ cat Data/example.gtf | cut -f 9 | cut -d ';' -f 1 | sort | uniq -c
8 gene_id "ENSG00000177693"
15 gene_id "ENSG00000184731"
3 gene_id "ENSG00000221311"
3 gene_id "ENSG00000222623"
19 gene_id "ENSG00000223972"
4 gene_id "ENSG00000227061"
83 gene_id "ENSG00000227232"
8 gene_id "ENSG00000233004"
3 gene_id "ENSG00000233750"
15 gene_id "ENSG00000237613"
12 gene_id "ENSG00000237683"
18 gene_id "ENSG00000238009"
12 gene_id "ENSG00000239368"
4 gene_id "ENSG00000239906"
4 gene_id "ENSG00000239945"
3 gene_id "ENSG00000240361"
3 gene_id "ENSG00000240786"
4 gene_id "ENSG00000241599"
8 gene_id "ENSG00000241860"
8 gene_id "ENSG00000243485"
Mar402 12:53:46 ~
$ cat Data/example.gtf | cut -f 9 | cut -d ';' -f 1 | sort | uniq -c | tr -s ' ' '\t'
8 gene_id "ENSG00000177693"
15 gene_id "ENSG00000184731"
3 gene_id "ENSG00000221311"
3 gene_id "ENSG00000222623"
19 gene_id "ENSG00000223972"
4 gene_id "ENSG00000227061"
83 gene_id "ENSG00000227232"
8 gene_id "ENSG00000233004"
3 gene_id "ENSG00000233750"
15 gene_id "ENSG00000237613"
12 gene_id "ENSG00000237683"
18 gene_id "ENSG00000238009"
12 gene_id "ENSG00000239368"
4 gene_id "ENSG00000239906"
4 gene_id "ENSG00000239945"
3 gene_id "ENSG00000240361"
3 gene_id "ENSG00000240786"
4 gene_id "ENSG00000241599"
8 gene_id "ENSG00000241860"
8 gene_id "ENSG00000243485"
Mar402 19:18:50 ~
$ cat Data/md5.txt #the original file is untidy: several values are packed into each field
fastq_md5 fastq_aspera
d57df747bc142e9850074d512ab9d6db;3331c6a9e0183ff9d398a3292dd45f66 SRR1039508_1.fastq.gz;SRR1039508_2.fastq.gz
49400c5685f36f830a277a59004b119d;ab4410a432cc18c1b9f10f93634e5310 SRR1039509_1.fastq.gz;SRR1039509_2.fastq.gz
d2c2d92c67c943648fdde6c70bc0d920;3e4223e08b97f37f3da17d686739e75c SRR1039510_1.fastq.gz;SRR1039510_2.fastq.gz
4073b1519608c24c0c1119b580dfd9eb;2fcb23d5fb63e322d80cd3cab75faa0b SRR1039511_1.fastq.gz;SRR1039511_2.fastq.gz
a35f30576f25ea548c7b3a28895a81cf;83bbe3c587d9477938826ea19c53a281 SRR1039512_1.fastq.gz;SRR1039512_2.fastq.gz
b3073b5b057f24208ac1853fdd4b5875;945cb34259d6dbf0362fe9018f769de4;ecb43490d03c9b325352e70488d58611 SRR1039513.fastq.gz;SRR1039513_1.fastq.gz;SRR1039513_2.fastq.gz
ae35fe0ce13badacc48c65717e811528;9ef4fe59d6378c513f933e24d12f6047 SRR1039514_1.fastq.gz;SRR1039514_2.fastq.gz
929b988eb5730eba77aeac98bf8be35f;c674d2ea79835165828b37258abbc925;5640a85f2c181d4886e905e74a32f041 SRR1039515.fastq.gz;SRR1039515_1.fastq.gz;SRR1039515_2.fastq.gz
8f97b3dc8170ecd6fffb39101c3e5bf5;2c4d2ba3b812f14bce25966c98b5b5df;8599c02799338b9514e8d0077a8409e4 SRR1039516.fastq.gz;SRR1039516_1.fastq.gz;SRR1039516_2.fastq.gz
1f2796f07033ec3bfab0981bd0674bb9;008ba2b3b589d553e3e9f8890d5481c2 SRR1039517_1.fastq.gz;SRR1039517_2.fastq.gz
64d1444ad727f48066aeb6ad314d9190;a24eea863bdca0284591fcd5eb076a93 SRR1039518_1.fastq.gz;SRR1039518_2.fastq.gz
f11f41c013ffaf3a031c9836ce81e6ef;9283f111ef774248f6f666e4bf2b1f81;9bcb6c9675631b1dcb8b07f6916d546c SRR1039519.fastq.gz;SRR1039519_1.fastq.gz;SRR1039519_2.fastq.gz
d8251c87ba3c803d4344c2b24c77b19d;ca8e0014e7ba56982adc37439cea0755;62838f21e66ec78030b51ee6019420ef SRR1039520.fastq.gz;SRR1039520_1.fastq.gz;SRR1039520_2.fastq.gz
637e08d030778c6581731647f3c3d8cc;4be82ad33d7d4990bed3c4bc701dc070;435aa5e48ba77e4c42218930a0be0de1 SRR1039521.fastq.gz;SRR1039521_1.fastq.gz;SRR1039521_2.fastq.gz
789e86036c81a85d2c1f014f79822d64;54c572cead4074b126f0b81b344af1be;c461a163b72a71efb4027045e6b4d2f6 SRR1039522.fastq.gz;SRR1039522_1.fastq.gz;SRR1039522_2.fastq.gz
ae33f7f6d536d020a2562b8be6e9cc33;083213dc45820db2eb62d66b89e77ce9 SRR1039523_1.fastq.gz;SRR1039523_2.fastq.gz
Mar402 20:04:38 ~
$ cat Data/md5.txt | cut -f 1 #first extract column 1
fastq_md5
d57df747bc142e9850074d512ab9d6db;3331c6a9e0183ff9d398a3292dd45f66
49400c5685f36f830a277a59004b119d;ab4410a432cc18c1b9f10f93634e5310
d2c2d92c67c943648fdde6c70bc0d920;3e4223e08b97f37f3da17d686739e75c
4073b1519608c24c0c1119b580dfd9eb;2fcb23d5fb63e322d80cd3cab75faa0b
a35f30576f25ea548c7b3a28895a81cf;83bbe3c587d9477938826ea19c53a281
b3073b5b057f24208ac1853fdd4b5875;945cb34259d6dbf0362fe9018f769de4;ecb43490d03c9b325352e70488d58611
ae35fe0ce13badacc48c65717e811528;9ef4fe59d6378c513f933e24d12f6047
929b988eb5730eba77aeac98bf8be35f;c674d2ea79835165828b37258abbc925;5640a85f2c181d4886e905e74a32f041
8f97b3dc8170ecd6fffb39101c3e5bf5;2c4d2ba3b812f14bce25966c98b5b5df;8599c02799338b9514e8d0077a8409e4
1f2796f07033ec3bfab0981bd0674bb9;008ba2b3b589d553e3e9f8890d5481c2
64d1444ad727f48066aeb6ad314d9190;a24eea863bdca0284591fcd5eb076a93
f11f41c013ffaf3a031c9836ce81e6ef;9283f111ef774248f6f666e4bf2b1f81;9bcb6c9675631b1dcb8b07f6916d546c
d8251c87ba3c803d4344c2b24c77b19d;ca8e0014e7ba56982adc37439cea0755;62838f21e66ec78030b51ee6019420ef
637e08d030778c6581731647f3c3d8cc;4be82ad33d7d4990bed3c4bc701dc070;435aa5e48ba77e4c42218930a0be0de1
789e86036c81a85d2c1f014f79822d64;54c572cead4074b126f0b81b344af1be;c461a163b72a71efb4027045e6b4d2f6
ae33f7f6d536d020a2562b8be6e9cc33;083213dc45820db2eb62d66b89e77ce9
Mar402 20:08:31 ~
$ cat Data/md5.txt | cut -f 1 | tr ';' '\n' #turn every ; into a newline
fastq_md5
d57df747bc142e9850074d512ab9d6db
3331c6a9e0183ff9d398a3292dd45f66
49400c5685f36f830a277a59004b119d
ab4410a432cc18c1b9f10f93634e5310
d2c2d92c67c943648fdde6c70bc0d920
3e4223e08b97f37f3da17d686739e75c
4073b1519608c24c0c1119b580dfd9eb
2fcb23d5fb63e322d80cd3cab75faa0b
a35f30576f25ea548c7b3a28895a81cf
83bbe3c587d9477938826ea19c53a281
b3073b5b057f24208ac1853fdd4b5875
945cb34259d6dbf0362fe9018f769de4
ecb43490d03c9b325352e70488d58611
ae35fe0ce13badacc48c65717e811528
9ef4fe59d6378c513f933e24d12f6047
929b988eb5730eba77aeac98bf8be35f
c674d2ea79835165828b37258abbc925
5640a85f2c181d4886e905e74a32f041
8f97b3dc8170ecd6fffb39101c3e5bf5
2c4d2ba3b812f14bce25966c98b5b5df
8599c02799338b9514e8d0077a8409e4
1f2796f07033ec3bfab0981bd0674bb9
008ba2b3b589d553e3e9f8890d5481c2
64d1444ad727f48066aeb6ad314d9190
a24eea863bdca0284591fcd5eb076a93
f11f41c013ffaf3a031c9836ce81e6ef
9283f111ef774248f6f666e4bf2b1f81
9bcb6c9675631b1dcb8b07f6916d546c
d8251c87ba3c803d4344c2b24c77b19d
ca8e0014e7ba56982adc37439cea0755
62838f21e66ec78030b51ee6019420ef
637e08d030778c6581731647f3c3d8cc
4be82ad33d7d4990bed3c4bc701dc070
435aa5e48ba77e4c42218930a0be0de1
789e86036c81a85d2c1f014f79822d64
54c572cead4074b126f0b81b344af1be
c461a163b72a71efb4027045e6b4d2f6
ae33f7f6d536d020a2562b8be6e9cc33
083213dc45820db2eb62d66b89e77ce9
Mar402 20:09:07 ~
$ cat Data/md5.txt | cut -f 1 | tr ';' '\n' > tmp1 #save the result as tmp1
Mar402 20:09:41 ~
$ cat Data/md5.txt | cut -f 2 | tr ';' '\n' > tmp2 #save the result as tmp2
$ paste tmp1 tmp2 #paste tmp1 and tmp2 side by side
fastq_md5 fastq_aspera
d57df747bc142e9850074d512ab9d6db SRR1039508_1.fastq.gz
3331c6a9e0183ff9d398a3292dd45f66 SRR1039508_2.fastq.gz
49400c5685f36f830a277a59004b119d SRR1039509_1.fastq.gz
ab4410a432cc18c1b9f10f93634e5310 SRR1039509_2.fastq.gz
d2c2d92c67c943648fdde6c70bc0d920 SRR1039510_1.fastq.gz
3e4223e08b97f37f3da17d686739e75c SRR1039510_2.fastq.gz
4073b1519608c24c0c1119b580dfd9eb SRR1039511_1.fastq.gz
2fcb23d5fb63e322d80cd3cab75faa0b SRR1039511_2.fastq.gz
a35f30576f25ea548c7b3a28895a81cf SRR1039512_1.fastq.gz
83bbe3c587d9477938826ea19c53a281 SRR1039512_2.fastq.gz
b3073b5b057f24208ac1853fdd4b5875 SRR1039513.fastq.gz
945cb34259d6dbf0362fe9018f769de4 SRR1039513_1.fastq.gz
ecb43490d03c9b325352e70488d58611 SRR1039513_2.fastq.gz
ae35fe0ce13badacc48c65717e811528 SRR1039514_1.fastq.gz
9ef4fe59d6378c513f933e24d12f6047 SRR1039514_2.fastq.gz
929b988eb5730eba77aeac98bf8be35f SRR1039515.fastq.gz
c674d2ea79835165828b37258abbc925 SRR1039515_1.fastq.gz
5640a85f2c181d4886e905e74a32f041 SRR1039515_2.fastq.gz
8f97b3dc8170ecd6fffb39101c3e5bf5 SRR1039516.fastq.gz
2c4d2ba3b812f14bce25966c98b5b5df SRR1039516_1.fastq.gz
8599c02799338b9514e8d0077a8409e4 SRR1039516_2.fastq.gz
1f2796f07033ec3bfab0981bd0674bb9 SRR1039517_1.fastq.gz
008ba2b3b589d553e3e9f8890d5481c2 SRR1039517_2.fastq.gz
64d1444ad727f48066aeb6ad314d9190 SRR1039518_1.fastq.gz
a24eea863bdca0284591fcd5eb076a93 SRR1039518_2.fastq.gz
f11f41c013ffaf3a031c9836ce81e6ef SRR1039519.fastq.gz
9283f111ef774248f6f666e4bf2b1f81 SRR1039519_1.fastq.gz
9bcb6c9675631b1dcb8b07f6916d546c SRR1039519_2.fastq.gz
d8251c87ba3c803d4344c2b24c77b19d SRR1039520.fastq.gz
ca8e0014e7ba56982adc37439cea0755 SRR1039520_1.fastq.gz
62838f21e66ec78030b51ee6019420ef SRR1039520_2.fastq.gz
637e08d030778c6581731647f3c3d8cc SRR1039521.fastq.gz
4be82ad33d7d4990bed3c4bc701dc070 SRR1039521_1.fastq.gz
435aa5e48ba77e4c42218930a0be0de1 SRR1039521_2.fastq.gz
789e86036c81a85d2c1f014f79822d64 SRR1039522.fastq.gz
54c572cead4074b126f0b81b344af1be SRR1039522_1.fastq.gz
c461a163b72a71efb4027045e6b4d2f6 SRR1039522_2.fastq.gz
ae33f7f6d536d020a2562b8be6e9cc33 SRR1039523_1.fastq.gz
083213dc45820db2eb62d66b89e77ce9 SRR1039523_2.fastq.gz
Mar402 20:10:57 ~
$ paste tmp1 tmp2 > tmp3 #save the pasted result as tmp3
Mar402 20:11:27 ~
$ mv tmp3 md5 #rename tmp3 to md5
Mar402 20:11:49 ~
$ cat md5 #view the result
fastq_md5 fastq_aspera
d57df747bc142e9850074d512ab9d6db SRR1039508_1.fastq.gz
3331c6a9e0183ff9d398a3292dd45f66 SRR1039508_2.fastq.gz
49400c5685f36f830a277a59004b119d SRR1039509_1.fastq.gz
ab4410a432cc18c1b9f10f93634e5310 SRR1039509_2.fastq.gz
d2c2d92c67c943648fdde6c70bc0d920 SRR1039510_1.fastq.gz
3e4223e08b97f37f3da17d686739e75c SRR1039510_2.fastq.gz
4073b1519608c24c0c1119b580dfd9eb SRR1039511_1.fastq.gz
2fcb23d5fb63e322d80cd3cab75faa0b SRR1039511_2.fastq.gz
a35f30576f25ea548c7b3a28895a81cf SRR1039512_1.fastq.gz
83bbe3c587d9477938826ea19c53a281 SRR1039512_2.fastq.gz
b3073b5b057f24208ac1853fdd4b5875 SRR1039513.fastq.gz
945cb34259d6dbf0362fe9018f769de4 SRR1039513_1.fastq.gz
ecb43490d03c9b325352e70488d58611 SRR1039513_2.fastq.gz
ae35fe0ce13badacc48c65717e811528 SRR1039514_1.fastq.gz
9ef4fe59d6378c513f933e24d12f6047 SRR1039514_2.fastq.gz
929b988eb5730eba77aeac98bf8be35f SRR1039515.fastq.gz
c674d2ea79835165828b37258abbc925 SRR1039515_1.fastq.gz
5640a85f2c181d4886e905e74a32f041 SRR1039515_2.fastq.gz
8f97b3dc8170ecd6fffb39101c3e5bf5 SRR1039516.fastq.gz
2c4d2ba3b812f14bce25966c98b5b5df SRR1039516_1.fastq.gz
8599c02799338b9514e8d0077a8409e4 SRR1039516_2.fastq.gz
1f2796f07033ec3bfab0981bd0674bb9 SRR1039517_1.fastq.gz
008ba2b3b589d553e3e9f8890d5481c2 SRR1039517_2.fastq.gz
64d1444ad727f48066aeb6ad314d9190 SRR1039518_1.fastq.gz
a24eea863bdca0284591fcd5eb076a93 SRR1039518_2.fastq.gz
f11f41c013ffaf3a031c9836ce81e6ef SRR1039519.fastq.gz
9283f111ef774248f6f666e4bf2b1f81 SRR1039519_1.fastq.gz
9bcb6c9675631b1dcb8b07f6916d546c SRR1039519_2.fastq.gz
d8251c87ba3c803d4344c2b24c77b19d SRR1039520.fastq.gz
ca8e0014e7ba56982adc37439cea0755 SRR1039520_1.fastq.gz
62838f21e66ec78030b51ee6019420ef SRR1039520_2.fastq.gz
637e08d030778c6581731647f3c3d8cc SRR1039521.fastq.gz
4be82ad33d7d4990bed3c4bc701dc070 SRR1039521_1.fastq.gz
435aa5e48ba77e4c42218930a0be0de1 SRR1039521_2.fastq.gz
789e86036c81a85d2c1f014f79822d64 SRR1039522.fastq.gz
54c572cead4074b126f0b81b344af1be SRR1039522_1.fastq.gz
c461a163b72a71efb4027045e6b4d2f6 SRR1039522_2.fastq.gz
ae33f7f6d536d020a2562b8be6e9cc33 SRR1039523_1.fastq.gz
083213dc45820db2eb62d66b89e77ce9 SRR1039523_2.fastq.gz
Mar402 20:11:56 ~
$ cat Data/md5.txt
fastq_md5 fastq_aspera
d57df747bc142e9850074d512ab9d6db;3331c6a9e0183ff9d398a3292dd45f66 SRR1039508_1.fastq.gz;SRR1039508_2.fastq.gz
49400c5685f36f830a277a59004b119d;ab4410a432cc18c1b9f10f93634e5310 SRR1039509_1.fastq.gz;SRR1039509_2.fastq.gz
d2c2d92c67c943648fdde6c70bc0d920;3e4223e08b97f37f3da17d686739e75c SRR1039510_1.fastq.gz;SRR1039510_2.fastq.gz
4073b1519608c24c0c1119b580dfd9eb;2fcb23d5fb63e322d80cd3cab75faa0b SRR1039511_1.fastq.gz;SRR1039511_2.fastq.gz
a35f30576f25ea548c7b3a28895a81cf;83bbe3c587d9477938826ea19c53a281 SRR1039512_1.fastq.gz;SRR1039512_2.fastq.gz
b3073b5b057f24208ac1853fdd4b5875;945cb34259d6dbf0362fe9018f769de4;ecb43490d03c9b325352e70488d58611 SRR1039513.fastq.gz;SRR1039513_1.fastq.gz;SRR1039513_2.fastq.gz
ae35fe0ce13badacc48c65717e811528;9ef4fe59d6378c513f933e24d12f6047 SRR1039514_1.fastq.gz;SRR1039514_2.fastq.gz
929b988eb5730eba77aeac98bf8be35f;c674d2ea79835165828b37258abbc925;5640a85f2c181d4886e905e74a32f041 SRR1039515.fastq.gz;SRR1039515_1.fastq.gz;SRR1039515_2.fastq.gz
8f97b3dc8170ecd6fffb39101c3e5bf5;2c4d2ba3b812f14bce25966c98b5b5df;8599c02799338b9514e8d0077a8409e4 SRR1039516.fastq.gz;SRR1039516_1.fastq.gz;SRR1039516_2.fastq.gz
1f2796f07033ec3bfab0981bd0674bb9;008ba2b3b589d553e3e9f8890d5481c2 SRR1039517_1.fastq.gz;SRR1039517_2.fastq.gz
64d1444ad727f48066aeb6ad314d9190;a24eea863bdca0284591fcd5eb076a93 SRR1039518_1.fastq.gz;SRR1039518_2.fastq.gz
f11f41c013ffaf3a031c9836ce81e6ef;9283f111ef774248f6f666e4bf2b1f81;9bcb6c9675631b1dcb8b07f6916d546c SRR1039519.fastq.gz;SRR1039519_1.fastq.gz;SRR1039519_2.fastq.gz
d8251c87ba3c803d4344c2b24c77b19d;ca8e0014e7ba56982adc37439cea0755;62838f21e66ec78030b51ee6019420ef SRR1039520.fastq.gz;SRR1039520_1.fastq.gz;SRR1039520_2.fastq.gz
637e08d030778c6581731647f3c3d8cc;4be82ad33d7d4990bed3c4bc701dc070;435aa5e48ba77e4c42218930a0be0de1 SRR1039521.fastq.gz;SRR1039521_1.fastq.gz;SRR1039521_2.fastq.gz
789e86036c81a85d2c1f014f79822d64;54c572cead4074b126f0b81b344af1be;c461a163b72a71efb4027045e6b4d2f6 SRR1039522.fastq.gz;SRR1039522_1.fastq.gz;SRR1039522_2.fastq.gz
ae33f7f6d536d020a2562b8be6e9cc33;083213dc45820db2eb62d66b89e77ce9 SRR1039523_1.fastq.gz;SRR1039523_2.fastq.gz
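paste pairs line N of one file with line N of the other and does not complain when the files differ in length, so a line-count check before pasting is a cheap safeguard. The sketch below repeats the exercise on a two-row stand-in for Data/md5.txt; the hashes h1 to h5 and the file names are placeholders, and the check itself is an addition, not part of the original exercise.

```shell
dir=$(mktemp -d)
cd "$dir"
# Two-row stand-in for Data/md5.txt with placeholder values.
printf 'h1;h2\tS1_1.fq.gz;S1_2.fq.gz\nh3;h4;h5\tS2.fq.gz;S2_1.fq.gz;S2_2.fq.gz\n' > md5.txt

cut -f 1 md5.txt | tr ';' '\n' > tmp1
cut -f 2 md5.txt | tr ';' '\n' > tmp2
n1=$(( $(wc -l < tmp1) ))
n2=$(( $(wc -l < tmp2) ))

# Only paste when both columns unfold to the same number of lines.
if [ "$n1" -eq "$n2" ]; then
    paste tmp1 tmp2 > md5
else
    echo "column lengths differ: $n1 vs $n2" >&2
fi
cat md5
```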
----- From 生信技能树 -----