In-Depth Analysis: A RandomAccessFile Class with Efficient I/O Using 1 KB of Memory

Date: 2023-06-13 09:14:54

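The most widely used J2SDK version at present is the 1.3 series. Developers on this version who need random file access have to use the RandomAccessFile class. Its I/O performance lags far behind the equivalent facilities of other common languages and seriously hurts program efficiency.
Developers urgently need better performance. Below we analyze the source code of RandomAccessFile and related file classes, locate the bottleneck, and improve on it to build a random file access class with good cost/performance: BufferedRandomAccessFile.
Before making any changes, we run a baseline test: copy a 12 MB file byte by byte (this exercises both reading and writing).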

Read                                   Write                                    Time (s)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream+DataInputStream    BufferedOutputStream+DataOutputStream    2.935

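The two differ by a factor of about 32; RandomAccessFile is far too slow. Let us look at the key parts of the source code of both, compare them, and find the cause.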
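The byte-by-byte copy test can be sketched roughly as follows. This is only an illustration, not the author's benchmark code: the class name `ByteCopyBenchmark` and both method names are hypothetical, and the sketch assumes a current JDK (try-with-resources needs Java 7 or later) rather than J2SDK 1.3. Timings will vary widely with the machine and operating system.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical benchmark sketch; names are not from the article.
class ByteCopyBenchmark {

    // Copies src to dst one byte at a time through RandomAccessFile.
    // Returns the elapsed time in nanoseconds.
    static long copyWithRandomAccessFile(String src, String dst) throws IOException {
        long start = System.nanoTime();
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            out.setLength(0); // discard any previous content of dst
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        }
        return System.nanoTime() - start;
    }

    // Copies src to dst one byte at a time through buffered data streams.
    // Returns the elapsed time in nanoseconds.
    static long copyWithBufferedStreams(String src, String dst) throws IOException {
        long start = System.nanoTime();
        try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(src)));
             DataOutputStream out = new DataOutputStream(
                 new BufferedOutputStream(new FileOutputStream(dst)))) {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: an existing source file; the copies are written next to it.
        String src = args[0];
        long slow = copyWithRandomAccessFile(src, src + ".raf.copy");
        long fast = copyWithBufferedStreams(src, src + ".buf.copy");
        System.out.printf("RandomAccessFile: %.3f s%n", slow / 1e9);
        System.out.printf("Buffered streams: %.3f s%n", fast / 1e9);
    }
}
```

Both methods issue exactly one `read()` and one `write()` call per byte, so any difference in elapsed time comes from what happens underneath those calls.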
1.1. [RandomAccessFile]

public class RandomAccessFile implements DataOutput, DataInput {
    public final byte readByte() throws IOException {
        int ch = this.read();
        if (ch < 0)
            throw new EOFException();
        return (byte) (ch);
    }

    // native call: one I/O request per byte read
    public native int read() throws IOException;

    public final void writeByte(int v) throws IOException {
        write(v);
    }

    // native call: one I/O request per byte written
    public native void write(int b) throws IOException;
}

As we can see, RandomAccessFile performs one native I/O operation on the file for every single byte it reads or writes.
1.2. [BufferedInputStream]

public class BufferedInputStream extends FilterInputStream {
    private static int defaultBufferSize = 2048;
    protected byte buf[]; // the read buffer

    public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }

    public synchronized int read() throws IOException {
        ensureOpen();
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        return buf[pos++] & 0xff; // served directly from buf[]
    }

    private void fill() throws IOException {
        if (markpos < 0)
            pos = 0; /* no mark: throw away the buffer */
        else if (pos >= buf.length) /* no room left in buffer */
            if (markpos > 0) { /* can throw away early part of the buffer */
                int sz = pos - markpos;
                System.arraycopy(buf, markpos, buf, 0, sz);
                pos = sz;
                markpos = 0;
            } else if (buf.length >= marklimit) {
                markpos = -1; /* buffer got too big, invalidate mark */
                pos = 0; /* drop buffer contents */
            } else { /* grow buffer */
                int nsz = pos * 2;
                if (nsz > marklimit)
                    nsz = marklimit;
                byte nbuf[] = new byte[nsz];
                System.arraycopy(buf, 0, nbuf, 0, pos);
                buf = nbuf;
            }
        count = pos;
        int n = in.read(buf, pos, buf.length - pos);
        if (n > 0)
            count = n + pos;
    }
}

1.3. [BufferedOutputStream]

public class BufferedOutputStream extends FilterOutputStream {
    protected byte buf[]; // the write buffer

    public BufferedOutputStream(OutputStream out, int size) {
        super(out);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }

    public synchronized void write(int b) throws IOException {
        if (count >= buf.length) {
            flushBuffer();
        }
        buf[count++] = (byte) b; // written directly into buf[]
    }

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }
}

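As we can see, for every byte that BufferedInputStream/BufferedOutputStream reads or writes, if the data is already covered by buf[], the operation is performed directly on the in-memory buf[]. Otherwise buf[] is first exchanged with the disk (refilled from the corresponding file position on a read, flushed on a write) and the operation is then performed on buf[]. The vast majority of reads and writes therefore touch only the in-memory buf[].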
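The effect described above can be observed by counting how many calls actually reach the underlying stream. The following is a minimal sketch under stated assumptions: `CountingDemo`, `CountingInputStream`, and `underlyingCalls` are hypothetical names introduced here, and an in-memory `ByteArrayInputStream` stands in for the disk file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo; names are not from the article.
class CountingDemo {

    // Counts every read call that reaches the wrapped ("physical") stream.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads dataSize bytes one at a time and returns how many calls hit the
    // underlying stream. bufferSize <= 0 means "no buffering".
    static int underlyingCalls(int dataSize, int bufferSize) throws IOException {
        CountingInputStream counter =
            new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        InputStream in = bufferSize > 0
            ? new BufferedInputStream(counter, bufferSize)
            : counter;
        while (in.read() != -1) {
            // consume the stream byte by byte, as in the copy test
        }
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered calls: " + underlyingCalls(8192, 0));
        System.out.println("buffered calls:   " + underlyingCalls(8192, 1024));
    }
}
```

Without the buffer, every single-byte read reaches the underlying stream; with it, the underlying stream is only touched when buf[] has to be refilled.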
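1.4. Summary
Memory access time is on the order of nanoseconds (10^-9 s), while disk access time is on the order of milliseconds (10^-3 s); for a single operation, memory is about a million times faster than disk. In theory, even tens of thousands of memory operations cost far less than one disk I/O. The buffered streams clearly gain their efficiency by adding accesses to an in-memory BUF in order to reduce disk I/O, at the price of some extra overhead for managing BUF. In practice, access became about 32 times faster.
Based on the conclusion in 1.4, we now try to add a buffered read/write mechanism to the RandomAccessFile class as well.
A random access class differs from the sequential classes: the former is built by implementing the DataInput/DataOutput interfaces, while the latter are built by extending FilterInputStream/FilterOutputStream, so the approach cannot simply be copied over.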
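2.1. Allocate a buffer BUF (default: 1024 bytes) that is shared by reads and writes.
2.2. Implement read buffering first.
The basic logic of read buffering:
A. We want to read the byte at position POS in the file.
B. Check whether that position is covered by BUF. If so, read the byte directly from BUF and return it.
C. If not, reposition BUF to the part of the file that contains POS, fill it with BUFSIZE bytes of file content around that position, and go back to B.
The key parts of the code, with explanations, are given below: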
public class BufferedRandomAccessFile extends RandomAccessFile {
    // byte read(long pos): reads the byte at position pos of the current file.
    // bufstartpos and bufendpos are the file offsets that the first and last
    // bytes of buf[] map to.
    // curpos is the current file-pointer offset of this class.
    public byte read(long pos) throws IOException {
        if (pos < this.bufstartpos || pos > this.bufendpos) {
            this.flushbuf();
            this.seek(pos);
            if ((pos < this.bufstartpos) || (pos > this.bufendpos))
                throw new IOException();
        }
        this.curpos = pos;
        return this.buf[(int) (pos - this.bufstartpos)];
    }

    // void flushbuf(): if bufdirty is true, writes the data in buf[] that has
    // not yet reached the disk to the disk.
    private void flushbuf() throws IOException {
        if (this.bufdirty) {
            if (super.getFilePointer() != this.bufstartpos) {
                super.seek(this.bufstartpos);
            }
            super.write(this.buf, 0, this.bufusedsize);
            this.bufdirty = false;
        }
    }

    // void seek(long pos): moves the file pointer to pos and maps buf[] onto
    // the file block that contains pos.
    public void seek(long pos) throws IOException {
        if ((pos < this.bufstartpos) || (pos > this.bufendpos)) { // seek pos not in buf
            this.flushbuf();
            if ((pos >= 0) && (pos <= this.fileendpos) && (this.fileendpos != 0)) {
                // seek pos in file (file length > 0)
                // align bufstartpos to the start of the bufsize-byte block containing pos
                this.bufstartpos = pos - (pos % this.bufsize);
                this.bufusedsize = this.fillbuf();
            } else if (((pos == 0) && (this.fileendpos == 0))
                    || (pos == this.fileendpos + 1)) {
                // seek pos is append pos
                this.bufstartpos = pos;
                this.bufusedsize = 0;
            }
            this.bufendpos = this.bufstartpos + this.bufsize - 1;
        }
        this.curpos = pos;
    }

    // int fillbuf(): fills buf[] starting from bufstartpos.
    private int fillbuf() throws IOException {
        super.seek(this.bufstartpos);
        this.bufdirty = false;
        return super.read(this.buf);
    }
}
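Because the listing above is an excerpt (its fields and constructor are not shown), a self-contained miniature of steps A to C may help to experiment with the idea. This is a read-only sketch under stated assumptions, not the article's BufferedRandomAccessFile: the class `ReadBufferedFile`, its fields, and the `fills` counter are hypothetical, and it wraps a RandomAccessFile instead of extending it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical read-only sketch of the buffering logic (steps A to C).
class ReadBufferedFile implements AutoCloseable {
    private final RandomAccessFile file;
    private final byte[] buf;
    private long bufStart = -1; // file offset of buf[0]; -1 means nothing loaded yet
    private int bufUsed = 0;    // number of valid bytes currently in buf
    int fills = 0;              // how often the buffer was refilled from the file

    ReadBufferedFile(String path, int bufSize) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.buf = new byte[bufSize];
    }

    // Returns the byte at pos as 0..255, or -1 if pos is at or past end of file.
    int read(long pos) throws IOException {
        if (pos < 0) {
            throw new IllegalArgumentException("negative position: " + pos);
        }
        // Step B: is pos covered by the buffer?
        if (bufStart < 0 || pos < bufStart || pos >= bufStart + bufUsed) {
            // Step C: remap the buffer onto the block that contains pos.
            bufStart = pos - (pos % buf.length);
            bufUsed = fill();
            fills++;
            if (pos >= bufStart + bufUsed) {
                return -1; // pos lies beyond the end of the file
            }
        }
        return buf[(int) (pos - bufStart)] & 0xff;
    }

    // Loads up to buf.length bytes starting at bufStart; returns the count loaded.
    private int fill() throws IOException {
        file.seek(bufStart);
        int total = 0;
        while (total < buf.length) {
            int n = file.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of file
            }
            total += n;
        }
        return total;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Reading positions in ascending order touches the file only once per buffer-sized block; the `fills` counter makes that visible when trying different buffer sizes.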

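Buffered reading is now basically in place. We again copy a 12 MB file byte by byte (this involves both reading and writing; here BufferedRandomAccessFile is used to test the read speed):

Read                                   Write                                    Time (s)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedRandomAccessFile               BufferedOutputStream+DataOutputStream    2.813
BufferedInputStream+DataInputStream    BufferedOutputStream+DataOutputStream    2.935

The speed has clearly improved and is on par with BufferedInputStream+DataInputStream.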
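2.3. Implement write buffering.
The basic logic of write buffering:
A. We want to write one byte at position POS in the file.
B. Check whether that position is mapped by BUF. If so, write the byte directly into BUF and return true.
C. If not, reposition BUF to the part of the file that contains POS, fill it with BUFSIZE bytes of file content around that position, and go back to B.
The key parts of the code, with explanations, are given below:
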
// boolean write(byte bw, long pos): writes the byte bw at position pos of the
// current file.
// Depending on pos and on where BUF currently sits, the write may be a
// modification or an append, and may fall inside or outside BUF. The most
// likely case is tested first, which improves speed.
// fileendpos: the offset of the last byte of the current file; it mainly
// exists to handle appends.
public boolean write(byte bw, long pos) throws IOException {
    if ((pos >= this.bufstartpos) && (pos <= this.bufendpos)) {
        // write pos in buf
        this.buf[(int) (pos - this.bufstartpos)] = bw;
        this.bufdirty = true;
        if (pos == this.fileendpos + 1) { // write pos is append pos
            this.fileendpos++;
            this.bufusedsize++;
        }
    } else { // write pos not in buf
        this.seek(pos);
        if ((pos >= 0) && (pos <= this.fileendpos) && (this.fileendpos != 0)) {
            // write pos is modify file
            this.buf[(int) (pos - this.bufstartpos)] = bw;
        } else if (((pos == 0) && (this.fileendpos == 0))
                || (pos == this.fileendpos + 1)) { // write pos is append pos
            this.buf[0] = bw;
            this.fileendpos++;
            this.bufusedsize = 1;
        } else {
            throw new IndexOutOfBoundsException();
        }
        this.bufdirty = true;
    }
    this.curpos = pos;
    return true;
}
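The write path can likewise be tried out in isolation. The following is a simplified sketch under stated assumptions, not the article's class: `WriteBufferedFile` and all of its members are hypothetical, it wraps a RandomAccessFile, and it only supports single-byte writes that either modify an existing position or append directly at the end of the file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch of write buffering with a dirty flag.
class WriteBufferedFile implements AutoCloseable {
    private final RandomAccessFile file;
    private final byte[] buf;
    private long bufStart = 0;     // file offset of buf[0]
    private int bufUsed = 0;       // number of valid bytes in buf
    private boolean dirty = false; // true if buf holds bytes not yet on disk
    private long length;           // logical file length, including buffered appends

    WriteBufferedFile(String path, int bufSize) throws IOException {
        file = new RandomAccessFile(path, "rw");
        buf = new byte[bufSize];
        length = file.length();
        load(0);
    }

    // Writes b at pos; pos must lie inside the file or be the append position.
    void write(byte b, long pos) throws IOException {
        if (pos < 0 || pos > length) {
            throw new IndexOutOfBoundsException("pos=" + pos);
        }
        if (pos < bufStart || pos >= bufStart + buf.length) {
            flush();                         // save pending changes first
            load(pos - (pos % buf.length));  // then map buf onto the block of pos
        }
        int idx = (int) (pos - bufStart);
        buf[idx] = b;
        dirty = true;
        if (idx >= bufUsed) {
            bufUsed = idx + 1; // the buffer grew by an appended byte
        }
        if (pos == length) {
            length++;          // an append extends the logical file length
        }
    }

    // Maps buf onto the block starting at start and loads its current content.
    private void load(long start) throws IOException {
        bufStart = start;
        bufUsed = 0;
        file.seek(start);
        while (bufUsed < buf.length) {
            int n = file.read(buf, bufUsed, buf.length - bufUsed);
            if (n < 0) {
                break; // end of file
            }
            bufUsed += n;
        }
    }

    // Writes the buffered bytes back to the file if anything changed.
    void flush() throws IOException {
        if (dirty) {
            file.seek(bufStart);
            file.write(buf, 0, bufUsed);
            dirty = false;
        }
    }

    long length() {
        return length;
    }

    @Override
    public void close() throws IOException {
        flush();
        file.close();
    }
}
```

The dirty flag ensures that the file is written at most once per visited block, and that a buffer which was only read is never written back.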

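Buffered writing is now basically in place. We again copy a 12 MB file byte by byte (this involves both reading and writing; combined with buffered reading, BufferedRandomAccessFile is used to test both read and write speed):

Read                                   Write                                    Time (s)
RandomAccessFile                       RandomAccessFile                         95.848
BufferedInputStream+DataInputStream    BufferedOutputStream+DataOutputStream    2.935
BufferedRandomAccessFile               BufferedOutputStream+DataOutputStream    2.813
BufferedRandomAccessFile               BufferedRandomAccessFile                 2.453
BufferedRandomAccessFile (optimized)   BufferedRandomAccessFile (optimized)     2.197

Although the gain from optimization is not large, the optimized version is still somewhat faster than the unoptimized one (2.197 s versus 2.453 s); the effect may well be more noticeable on older machines.
The comparison above covers sequential access. Even with random access, most operations touch more than one byte, so the buffering mechanism remains effective. Making an ordinary sequential access class support random access, by contrast, is not easy.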
Remaining improvements
Provide a file append function:
public boolean append(byte bw) throws IOException {
    return this.write(bw, this.fileendpos + 1);
}

Provide a function that modifies the byte at the current file position:
public boolean write(byte bw) throws IOException {
    return this.write(bw, this.curpos);
}

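Return the file length (because reads and writes go through BUF, this differs from the original RandomAccessFile class):
public long length() throws IOException {
    return Math.max(this.fileendpos + 1, this.initfilelen);
}

Return the current file pointer (because reads and writes go through BUF, this differs from the original RandomAccessFile class):
public long getFilePointer() throws IOException {
    return this.curpos;
}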

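Provide buffered writing of multiple bytes at the current position:
public void write(byte b[], int off, int len) throws IOException {
    long writeendpos = this.curpos + len - 1;
    if (writeendpos <= this.bufendpos) { // b[] in cur buf
        System.arraycopy(b, off, this.buf, (int) (this.curpos - this.bufstartpos), len);
        this.bufdirty = true;
        // never shrink the count of valid bytes when writing in the middle of buf[]
        this.bufusedsize = Math.max(this.bufusedsize, (int) (writeendpos - this.bufstartpos + 1));
    } else { // b[] not in cur buf
        this.flushbuf(); // write pending buffered data first so it cannot overwrite b[] later
        super.seek(this.curpos);
        super.write(b, off, len);
    }
    if (writeendpos > this.fileendpos)
        this.fileendpos = writeendpos;
    this.seek(writeendpos + 1);
}

public void write(byte b[]) throws IOException {
    this.write(b, 0, b.length);
}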

Provide buffered reading of multiple bytes at the current position:
public int read(byte b[], int off, int len) throws IOException {
    long readendpos = this.curpos + len - 1;
    if (readendpos <= this.bufendpos && readendpos <= this.fileendpos) {
        // read in buf
        System.arraycopy(this.buf, (int) (this.curpos - this.bufstartpos), b, off, len);
    } else { // read b[] size > buf[]
        if (readendpos > this.fileendpos) { // read b[] part in file
            len = (int) (this.length() - this.curpos + 1);
        }
        this.flushbuf(); // make sure the file contains any buffered writes before reading it directly
        super.seek(this.curpos);
        len = super.read(b, off, len);
        if (len < 0) { // end of file: leave the file pointer where it is
            return len;
        }
        readendpos = this.curpos + len - 1;
    }
    this.seek(readendpos + 1);
    return len;
}

public int read(byte b[]) throws IOException {
    return this.read(b, 0, b.length);
}

public void setLength(long newLength) throws IOException {
    if (newLength > 0) {
        this.fileendpos = newLength - 1;
    } else {
        this.fileendpos = 0;
    }
    super.setLength(newLength);
}

// flush buffered writes before the file is closed
public void close() throws IOException {
    this.flushbuf();
    super.close();
}

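The improvements are now basically complete. To try out the new multi-byte read/write functions, we copy a 12 MB file by reading and writing 1024 bytes at a time (this involves both reading and writing, and tests the read/write speed of the completed BufferedRandomAccessFile).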
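A block-wise copy of this kind can be sketched as follows. Since the complete BufferedRandomAccessFile is not listed here, the sketch uses the standard RandomAccessFile; the class name `BlockCopy` and the method `copy` are hypothetical, and the 1024-byte block size would be passed in as a parameter.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: copy a file in fixed-size blocks instead of byte by byte.
class BlockCopy {

    // Copies src to dst using blocks of blockSize bytes; returns the bytes copied.
    static long copy(String src, String dst, int blockSize) throws IOException {
        byte[] block = new byte[blockSize];
        long total = 0;
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            out.setLength(0); // discard any previous content of dst
            int n;
            // read(byte[], int, int) returns the number of bytes read, or -1 at end of file
            while ((n = in.read(block, 0, block.length)) != -1) {
                out.write(block, 0, n); // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: source file, args[1]: destination file
        long start = System.nanoTime();
        long copied = copy(args[0], args[1], 1024);
        System.out.printf("copied %d bytes in %.3f s%n",
                copied, (System.nanoTime() - start) / 1e9);
    }
}
```

Swapping in a buffered subclass of RandomAccessFile would leave the copy loop unchanged, because it relies only on `read(byte[], int, int)` and `write(byte[], int, int)`.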
