DokymeLex: an automatic generator of lexical-analyzer code, written in Java and designed to mimic Lex

Posted: 2023-02-19 12:20:14

DokymeLex

| Language | files | blank | comment | code |
|---|---|---|---|---|
| Java | 13 | 130 | 119 | 1176 |
| SUM: | 13 | 130 | 119 | 1176 |

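Overview

DokymeLex is a lexer code generator that imitates the functionality of Lex; in short, a "compiler-compiler". It reads a user-defined .dkm file, parses the declarations, regular definitions, and rules in that file, and generates Java source code for a lexical analyzer that runs on the JVM.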

Introduction to Lex

Lex helps write programs whose control flow is directed by instances of regular expressions in the input stream. It is well suited for editor-script type transformations and for segmenting input in preparation for a parsing routine. Lex source is a table of regular expressions and corresponding program fragments. The table is translated to a program which reads an input stream, copying it to an output stream and partitioning the input into strings which match the given expressions. As each such string is recognized the corresponding program fragment is executed. The recognition of the expressions is performed by a deterministic finite automaton generated by Lex. The program fragments written by the user are executed in the order in which the corresponding regular expressions occur in the input stream.

The lexical analysis programs written with Lex accept ambiguous specifications and choose the longest match possible at each input point. If necessary, substantial lookahead is performed on the input, but the input stream will be backed up to the end of the current partition, so that the user has general freedom to manipulate it.

Lex can generate analyzers in either C or Ratfor, a language which can be translated automatically to portable Fortran. It is available on the PDP-11 UNIX, Honeywell GCOS, and IBM OS systems. This manual, however, will only discuss generating analyzers in C on the UNIX system, which is the only supported form of Lex under UNIX Version 7. Lex is designed to simplify interfacing with Yacc, for those with access to this compiler-compiler system.
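The longest-match behaviour described above can be illustrated with a small Java sketch. It is not how Lex or DokymeLex is implemented (both build an automaton, while this sketch simply tries each rule with `java.util.regex`), and the rule table is a made-up subset chosen for illustration. At each input position it keeps the rule that consumes the most characters and, on a tie, the rule listed first, which is the tie-breaking order Lex documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative longest-match scanner; an assumption-laden sketch, not the real generator output. */
final class LongestMatchDemo {
    /** Rules in priority order: earlier entries win ties. The subset is hypothetical. */
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("IF", Pattern.compile("if"));
        RULES.put("ID", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("ADDA", Pattern.compile("\\+="));
        RULES.put("ADD", Pattern.compile("\\+"));
        RULES.put("BLANK", Pattern.compile("[ \\r\\n]"));
    }

    /** Splits the input into token names, always taking the longest match at each position. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors at the region start; the strict '>' keeps
                // the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("No rule matches at offset " + pos);
            }
            tokens.add(bestName);
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "if" ties between IF and ID, so IF wins; "iffy" is longer as an ID.
        System.out.println(tokenize("if iffy+=x")); // [IF, BLANK, ID, ADDA, ID]
    }
}
```

Note that `+=` is reported as a single ADDA token rather than ADD followed by something else, because the longer match is preferred.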

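Complete usage guide

1. Make sure JDK 1.8 or later is installed on your machine. If it isn't, there is nothing I can do for you.
2. Create a lex file in some folder and write something in it; see ".lex file structure" for the format. Assume its path is `C:\dokyme.lex`.
3. In a console, run DokymeLex_{i386 or x64}.exe (DokymeLex.exe from here on) with the -l option (a lowercase L, not the digit 1 and not the letter i). Use -h to list all options. `.\DokymeLex.exe -l C:\dokyme.lex`
4. The program reads the declarations, regular definitions, and function definitions in the lex file and generates a .java source file containing a single class. The default output file name is DokymeLex.java and the default output directory is the one containing the executable; to change either, consult -h for the relevant options. Note: rename the file to DokymeLexer.java, otherwise the file name will not match the main class name and javac cannot compile it. Running time depends on the complexity of the lex file, and I am not that skilled, so please avoid overly complex specifications such as a lex definition for ANSI C.
5. The generated source uses the package name com by default, so create a com folder and put the generated .java file in it. You can also change the package name yourself, as long as it matches the directory structure. `javac com/DokymeLexer.java`
6. Run the resulting .class file. It also takes arguments, and -h again prints the help text. Suppose the file to analyze is wenwen.txt (in practice you would usually analyze a source file in some language, such as .c, .java, or .py; a txt file serves as the example here). `java com.DokymeLexer wenwen.txt`
7. The program prints the sequence of tokens it recognized. That is all.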

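.lex file structure

The file extension does not actually matter: it can be .lex, .txt, or anything else, as long as you give the full path to the file when running the program.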

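Regular definitions

Regular definitions are written as regular expressions. Each defined name is substituted directly wherever it appears in the declaration and rule sections that follow, much like #define in C.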
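The #define-style substitution can be sketched in Java as follows. This is only an assumption about how such an expansion might work, not DokymeLex's actual code: the class name `DefinitionExpander` and the method `expand` are hypothetical, and the sketch ignores quoting and escape sequences. Each name found in a pattern is replaced by its definition, wrapped in parentheses so operator precedence is preserved.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that expands named regular definitions inside a pattern. */
final class DefinitionExpander {
    /** A run of letters or underscores is treated as a candidate definition name. */
    private static final Pattern NAME = Pattern.compile("[A-Za-z_]+");

    static String expand(String pattern, Map<String, String> definitions) {
        Matcher m = NAME.matcher(pattern);
        StringBuffer out = new StringBuffer(); // the StringBuffer overload works on Java 8
        while (m.find()) {
            String body = definitions.get(m.group());
            // Unknown names (for example the keyword "while") are left untouched.
            String replacement = (body == null) ? m.group() : "(" + body + ")";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> defs = new LinkedHashMap<>();
        defs.put("letter", "[a-zA-Z]");
        defs.put("digit", "[0-9]");
        System.out.println(expand("letter(letter|digit)*", defs));
        // prints ([a-zA-Z])(([a-zA-Z])|([0-9]))*
    }
}
```

A real implementation would also have to skip names that appear inside quoted strings or after a backslash; the sketch deliberately leaves that out.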

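Declaration section

Declares the member variables the lexer needs. These members are effectively globally accessible, because the generated program consists of a single class. Code in the declaration section is copied verbatim into the definition of the lexer's main class.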

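Rule section

Matches string patterns and takes an action according to the matching rule. Code in the rule section is copied verbatim into the corresponding state.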
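To make "copied into the corresponding state" more concrete, here is a hand-written Java sketch of a tiny automaton whose accepting states carry rule actions. The layout of the code DokymeLex really generates is not documented here, so the class name, state constants, and method names are all assumptions; only the two actions, `{return "ID";}` and `{return "CONSTANT";}`, are taken from the example rules.

```java
/** Illustrative automaton for letter(letter|digit)* and digit+; not actual generator output. */
final class StateActionDemo {
    private static final int START = 0, IN_ID = 1, IN_NUMBER = 2, DEAD = 3;

    /** Deterministic transition function: exactly one next state per (state, character). */
    static int step(int state, char c) {
        boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        boolean digit = c >= '0' && c <= '9';
        switch (state) {
            case START:     return letter ? IN_ID : (digit ? IN_NUMBER : DEAD);
            case IN_ID:     return (letter || digit) ? IN_ID : DEAD;
            case IN_NUMBER: return digit ? IN_NUMBER : DEAD;
            default:        return DEAD;
        }
    }

    /** Runs the automaton over the lexeme, then executes the action attached to the final state. */
    static String classify(String lexeme) {
        int state = START;
        for (char c : lexeme.toCharArray()) {
            state = step(state, c);
        }
        switch (state) {
            case IN_ID:     { return "ID"; }        // body of the rule: id {return "ID";}
            case IN_NUMBER: { return "CONSTANT"; }  // body of the rule: number {return "CONSTANT";}
            default:        return null;            // no rule accepts this lexeme
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("x1")); // ID
        System.out.println(classify("42")); // CONSTANT
        System.out.println(classify("4x")); // null
    }
}
```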

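Program section

Main program code. The functions in this section are copied verbatim into the definition of the lexer's main class.

An example follows:

Example .lex file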

    [a-zA-Z]    {letter}
    [0-9]       {digit}
    letter(letter|digit)*   {id}
    digit+      {number}
    [!@#$%^&\*\(\)_\[\]{}?\+:;,.]  {symbol}
    [ \r\n] {blank}
    %%
    private int counter = 0;
    %%
    if    {increment();return "IF";}
    else  {return "ELSE";}
    while {return "WHILE";}
    for   {return "FOR";}
    switch  {return "SWITCH";}
    case  {return "CASE";}
    break {return "BREAK";}
    blank   {return "BLANK";}
    int   {return "INT";}
    float {return "FLOAT";}
    char  {return "CHAR";}
    bool  {return "BOOL";}
    void  {return "VOID";}
    static  {return "STATIC";}
    return {return "RETURN";}
    id      {return "ID";}
    number  {return "CONSTANT";}
    "(symbol|digit|letter)+"    {return "CONSTANT";}
    ;   {return "SEMI";}
    =     {return "ASN";}
    \+     {return "ADD";}
    \-     {return "SUB";}
    \*     {return "MUL";}
    /     {return "DIV";}
    %     {return "MOD";}
    >     {return "BT";}
    <     {return "ST";}
    ==    {return "EQU";}
    \+=    {return "ADDA";}
    \-=    {return "SUBA";}
    \*=    {return "MULA";}
    /=    {return "DIVA";}
    %=    {return "MODA";}
    \+\+    {return "INC";}
    \-\-    {return "DEC";}
    "     {return "QUO";}
    {     {return "LBRCE";}
    }     {return "RBRCE";}
    \[     {return "LBRCKT";}
    \]     {return "RBRCKT";}
    \(     {return "LPTH";}
    \)     {return "RPTH";}
    .     {return "DOT";}
    \|\|    {return "OR";}
    &&    {return "AND";}
    !     {return "NOT";}
    %%
    private void increment(){
        counter++;
    }
    public static void main(String[] args){
        new DokymeLexer(args);
    }
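The four-part layout (regular definitions, declarations, rules, program) separated by `%%` lines could be read with something like the Java sketch below. It is a minimal illustration under the assumption that a separator is a line containing only `%%`; the class name `SpecSplitter` is hypothetical and this is not DokymeLex's actual parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a lex specification into its %%-separated sections. */
final class SpecSplitter {
    static List<String> split(String spec) {
        List<String> sections = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : spec.split("\\r?\\n", -1)) {
            if (line.trim().equals("%%")) {
                // A separator line closes the current section and starts the next one.
                sections.add(current.toString());
                current.setLength(0);
            } else {
                current.append(line).append('\n');
            }
        }
        sections.add(current.toString());
        return sections;
    }

    public static void main(String[] args) {
        String spec = "[0-9]    {digit}\n"
                + "%%\n"
                + "private int counter = 0;\n"
                + "%%\n"
                + "digit+    {return \"CONSTANT\";}\n"
                + "%%\n"
                + "private void increment(){ counter++; }\n";
        List<String> sections = split(spec);
        System.out.println(sections.size());        // 4
        System.out.println(sections.get(1).trim()); // private int counter = 0;
    }
}
```

Comparing whole trimmed lines against `%%` keeps rules such as `%` and `%=` from being mistaken for separators.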

Source repository

https://github.com/Kherrisan/DokymeLex


Original link: https://www.996station.com/216773