Python domain name analysis tool: implementation code

Time: 2023-06-13 09:14:11
The code is as follows:

# Ported to Python 3: print() calls and urllib.request instead of urllib.urlopen
import datetime
import urllib.request

def getDate():
    # Today's date as YYYY-MM-DD, the format used in the list URL
    return datetime.date.today().isoformat()

# url = "http://www.kingnic.com/list/2009-06-16.txt"
def getUrl(dateStr=None):
    baseUrl = "http://www.kingnic.com/list/"
    if dateStr:
        return baseUrl + dateStr + ".txt"
    thisDate = getDate()
    if not thisDate:
        print("Error Date!")
        return None
    url = baseUrl + thisDate + ".txt"
    return url

def getSource(url):
    source = urllib.request.urlopen(url).read()
    # urlopen returns bytes; the list is assumed to be ASCII/UTF-8 text
    return source.decode("utf-8", errors="replace")

def save(source, filename="domains.txt"):
    with open(filename, "w") as fp:
        fp.write(source)
    return True

def loadList(fileName="domains.txt"):
    # Use the fileName argument (the original always opened "domains.txt")
    with open(fileName, "r") as fp:
        source = fp.readlines()
    return source

def getPrefix(domain):
    return domain.split(".")[0]

def getPostfix(domain):
    return domain.split(".")[1]

def hasMidLine(domain):
    return "-" in domain

def parser(domains):
    # Renamed from max/min to avoid shadowing the built-ins
    maxLen = 4
    minLen = 0
    keyword = ("sky", "see", "job")  # defined but not used by any filter below
    result = []

    len_num = 0       # domains rejected by the length filter
    mid_line_num = 0  # domains rejected because the prefix contains "-"

    for domain in domains:
        prefix = getPrefix(domain)
        postfix = getPostfix(domain)  # extracted but not used as a filter yet
        domainlen = len(prefix)
        if (domainlen < minLen) or (domainlen > maxLen):
            len_num += 1
            continue
        if hasMidLine(prefix):
            mid_line_num += 1
            continue
        result.append(domain)

    print("log:\n")
    print("all:\t", len(domains))
    print("len not in [%s,%s]\t:%s" % (maxLen, minLen, len_num))
    print('contain "-":\t', mid_line_num)
    print("remain:\t", len(result))
    return result

if __name__ == "__main__":
    url = getUrl()
    source = getSource(url)
    save(source)
    domains = loadList()
    result = parser(domains)
    # Each entry still ends with "\n", so a plain join writes one domain per line
    save("".join(result), "result.txt")
    print("\n\n\nfinished!!")
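To try the filtering rules without downloading the daily list, the length and hyphen checks can be run on a small in-memory sample. The following is only a minimal sketch, not the script's own code: `filter_domains` and the sample names are hypothetical, and the 0 to 4 length bounds simply mirror the values set in `parser`. Unlike the script, it strips the trailing newline and skips blank lines.

```python
def filter_domains(domains, min_len=0, max_len=4):
    """Keep domains whose prefix length is in [min_len, max_len] and has no '-'."""
    kept = []
    for line in domains:
        domain = line.strip()
        if not domain:
            continue  # skip blank lines instead of failing on them
        prefix = domain.split(".")[0]
        if len(prefix) < min_len or len(prefix) > max_len:
            continue  # rejected by the length filter
        if "-" in prefix:
            continue  # rejected by the hyphen filter
        kept.append(domain)
    return kept


# Hypothetical sample lines, shaped like the output of readlines()
sample = ["abcd.com\n", "a-b.net\n", "toolongname.com\n", "\n", "sky.org\n"]
print(filter_domains(sample))  # ['abcd.com', 'sky.org']
```

Passing different `min_len` and `max_len` values makes it easy to see how the length bounds change the size of the result.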

Output files:
domains.txt: the domains released that day, according to kingnic.com;
result.txt: the domains that pass the filters;
Log output:

all: 55500
len not in [4,0]: 55019
contain "-": 32
remain: 449
finished!!
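The three counters in this log partition the input: each domain is either rejected by the length filter, rejected by the hyphen filter, or kept. Because the hyphen check runs only after the length check passes, the 32 hyphenated names are all ones whose prefix is at most 4 characters long. A small sanity check of the arithmetic, using the numbers from the log above (the variable names are hypothetical):

```python
# Figures copied from the log output
total = 55500
bad_length = 55019
has_hyphen = 32
remain = 449

# The three groups are disjoint and together cover the whole list
assert bad_length + has_hyphen + remain == total

print("kept %.2f%% of the list" % (100.0 * remain / total))
```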

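The script filters on prefix length and on whether the name contains "-"; the suffix is extracted but not yet used as a filter. That is only a few filter conditions; others can be added later as needed.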
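One way further conditions could be added is to write each filter as a small predicate and keep only the domains that satisfy all of them. This is a sketch under assumptions rather than part of the original tool: the function names are hypothetical, the allowed suffixes `com` and `net` are an illustrative choice, and the keywords are the ones from the `keyword` tuple in the script.

```python
def has_allowed_suffix(domain, allowed=("com", "net")):
    """True if the part after the first dot is in `allowed` (illustrative list)."""
    parts = domain.strip().lower().split(".", 1)
    return len(parts) == 2 and parts[1] in allowed


def contains_keyword(domain, keywords=("sky", "see", "job")):
    """True if the prefix contains any of the keywords."""
    prefix = domain.strip().lower().split(".")[0]
    return any(word in prefix for word in keywords)


def apply_filters(domains, predicates):
    """Return the stripped domains for which every predicate is true."""
    return [d.strip() for d in domains
            if d.strip() and all(p(d) for p in predicates)]


# Hypothetical sample lines, shaped like the output of readlines()
sample = ["skyjob.com\n", "see.org\n", "hello.net\n", "jobs.net\n"]
print(apply_filters(sample, [has_allowed_suffix, contains_keyword]))
# ['skyjob.com', 'jobs.net']
```

With this layout a new condition is just one more function in the list, so the main loop does not need to change.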