zl程序教程

您现在的位置是:首页 >  后端

当前栏目

Python excel转换为table 主要是合并单元格问题解决

PythonExcel转换 解决 Table 合并 主要 单元格
2023-09-11 14:22:26 时间

需求

我们公司最近又开始搞这些神奇的需求了

把excel表格 转换为 html里面的table,数量估计有几百个,分了我几十个,尼玛,不想像他们那样一个一个手搓,伤不起。。。
决定用python解决,后面附上代码
隔壁那女的更傻,以复制粘贴的多为荣,只想说 呵呵

代码随便写的几句,一边百度一边写的,主要就是一些逻辑判断罢了





原始table






处理后的table






列的合并处理

项次1 检验项目3 规定值或允许偏差3 检验结果2 检验方法和频率3 权值1

因为给的模板的列 实际 他们在html里面画的table的列 数量不对,
这里需要重新调整过,后面要是有还有很多,可以在程序里面调整



模板页面

模板页面主要是定位了几个占位符,替换一下 标题 和 table的核心内容


@{
    ViewBag.Title = "tmp_title";
    Layout = "~/Views/Shared/_Form.cshtml";
}

<body>
    <style media="print">
        @@page {
            size: portrait;
            margin-top: 20mm;
            margin-left: 20mm;
            margin-bottom: 15mm;
            margin-right: 15mm;
        }

        .no-print {
            display: none !important;
        }

        .print-show {
            display: block !important;
        }
        /*@@page {
                size: landscape;
                margin-top:20mm;
                margin-left:15mm;
                margin-bottom:15mm;
                margin-right:15mm;
            }*/

    </style>
    <style>
        table {
            border-collapse: collapse;
            table-layout: fixed;
            text-align: center;
            margin: 0 auto;
            word-break: break-all;
        }

        input {
            width: 95%;
            height: 95%;
            border: none;
            text-align: center;
            font-size: 16px;
        }

        table span {
            display: inline-block;
            font-size: 17px;
        }

        textarea {
            width: 92%;
            height: 92%;
            resize: none;
            border: none;
            font-size: 16px;
            text-align: center;
            overflow: hidden
        }

        table tr:not(:nth-child(-n+5)) {
            border-bottom: 1px solid black;
            border-left: 3px solid black;
            border-right: 3px solid black;
        }


        table tr td {
            height: 30px;
            border-left: 1px solid black;
            border-bottom: none;
            border-top: none;
            font-size: 16px;
        }

        input[type=checkbox] {
            width: 17px;
            height: 17px;
            z-index: -1;
            -webkit-appearance: none;
            outline: none;
        }

            input[type=checkbox]:after {
                height: 100%;
                width: 100%;
                top: 0;
                content: "";
                /*content:url(../../../../Content/NewTableImg/20200410192843803.png);*/
                background: white;
                color: #000000;
                display: inline-block;
                visibility: visible;
                border-radius: 2px;
                border: 1px solid black;
            }

            input[type=checkbox]:checked:after {
                content: "✓";
                text-align: center;
                font-weight: bolder;
                height: 100%;
                width: 100%;
                font-size: 20px;
                position: relative;
                line-height: 17px;
            }
    </style>
    <form id="TableList" action="LowVoltageDistributionEquipmentBGD">
        <table id="TableLists" style="width:800px">
            <tr>
                <td colspan="13" style="text-align:center; border-top:none;border-right:none;border-left:none;border-bottom:0px solid black;height:35px;line-height:35px;"><input id="Projects" type="text" style="text-align:center;width:100%;border:none;font-size:19px"></td>
            </tr>
            <tr>
                <td colspan="13" style=" border:none;text-align:center;height:35px;line-height:35px;">
                    <span style="font-size:23px">
tmp_title                
                    </span>
                </td>
            </tr>
            <tr>
                <td colspan="2" style=" border:none;text-align:right;height:35px;line-height:35px;"><span style="font-weight:500">承包单位:</span></td>
                <td colspan="5" style=" border:none;border-bottom:1px solid black;height:35px;line-height:35px;"><input id="OrgConstruction" type="text" style="width:100%;text-align:left;font-weight:500;font-size: 17px;" /></td>
                <td colspan="2" style=" border:none;height:35px;line-height:35px;"></td>
                <td colspan="1" style=" border:none;text-align:right;height:35px;line-height:35px;"><span style="font-weight:500">合同号:</span></td>
                <td colspan="3" style=" border:none;border-bottom:1px solid black;height:35px;line-height:35px;"><input id="ContractCode" type="text" style="width:100%;text-align:left;font-weight:500;font-size: 17px;" /></td>
            </tr>
            <tr>
                <td colspan="2" style=" border:none;text-align:right;height:35px;line-height:35px;"> <span style="font-weight:500;">监理单位:</span></td>
                <td colspan="5" style=" border:none;border-bottom:1px solid black;height:35px;line-height:35px;"><input id="OrgSupervision" type="text" style="width:100%;text-align:left;font-weight:500;font-size: 17px;" /></td>
                <td colspan="2" style=" border:none;height:35px;line-height:35px;"></td>
                <td colspan="1" style=" border:none;text-align:right;height:35px;line-height:35px;"> <span style="font-weight:500">编  号:</span></td>
                <td colspan="3" style=" border:none;border-bottom:1px solid black;height:35px;line-height:35px;"><input id="Code" type="text" style="width:100%;text-align:left;font-weight:500;font-size: 17px;" /></td>
            </tr>
            <tr>
                <td colspan="9" style="border:none;height:35px;line-height:35px;"></td>
                <td colspan="4" style="width:100%;text-align:right;border:none;height:35px;line-height:35px;"><span>C-1</span></td>
            </tr>
            <tr style="border-top:3px solid black;height:60px">
                <td colspan="2">工程名称</td>
                <td colspan="5" style="white-space:normal;word-break:break-all;"><span class="print-show" style="display:none;"></span><textarea rows="1" class="no-print" id="UseSite" name="UseSite" style="text-align: center;overflow:hidden;word-break: break-all;height:auto;max-height:92%;"></textarea></td>
                <td colspan="2">施工时间</td>
                <td colspan="4"><input type="text" id="ConstructionDate" name="ConstructionDate" value="" /></td>
            </tr>
            <tr style="height:60px">
                <td colspan="2">桩号及部位</td>
                <td colspan="5" style="white-space:normal;word-break:break-all;"><span class="print-show" style="display:none;"></span><textarea rows="1" class="no-print" id="EngineerName" name="EngineerName" style="text-align: center;overflow:hidden;word-break: break-all;height:auto;max-height:92%;"></textarea></td>
                <td colspan="2">检验时间</td>
                <td colspan="4"><input type="text" id="TestDate" name="TestDate" value="" /></td>
            </tr>
            <tr>
                <td colspan="1" style="text-align:center">项次</td>
                <td colspan="3" style="text-align:center">检查项目</td>
                <td colspan="3">规定值或允许偏差</td>
                <td colspan="2" style="text-align:center">检查结果</td>
                <td colspan="3" style="text-align:center">检查方法和频率</td>
                <td colspan="1" style="text-align:center">权值</td>
            </tr>



            <!-- 开始-->
            tmp_content
            <!-- 结束-->




            <tr style="border-bottom:none">
                <td colspan="13" style="text-align: left;border-bottom: none" valign="top">自检结论:</td>
            </tr>
            <tr style="border-top:none">
                <td colspan="13" style="text-align:center;border-top:none;height:80px">
                    <input id="resut" name="resut" />
                </td>
            </tr>
            <tr>
                <td colspan="2">质检负责</td>
                <td colspan="2">
                    <input type="text" style="text-align:left;border:none" name="ZJFZ" readonly="readonly" id="ZJFZ" value="" />
                </td>
                <td colspan="2">检查</td>
                <td colspan="2">
                    <input type="text" style="text-align:left;border:none" name="JC" readonly="readonly" id="JC" value="" />
                </td>
                <td colspan="2">复核</td>
                <td colspan="3">
                    <input type="text" style="text-align:left;border:none" name="FH" readonly="readonly" id="FH" value="" />
                </td>
            </tr>
            <tr style="border-bottom:none">
                <td colspan="13" style="text-align:left;border-bottom:none" valign="top">监理意见:</td>
            </tr>
            <tr style="border-bottom:none;border-top:none;height:60px">
                <td>
                    <textarea id="opinion" name="opinion"></textarea>
                </td>
            </tr>
            <tr style="border-top:none;border-bottom:3px solid black">
                <td colspan="3" style="border:none"></td>
                <td colspan="3" style="border:none">专业监理工程师:</td>
                <td colspan="4" style="border:none">
                    <input type="text" style="text-align:left;border:none;height:24px;line-height:24px;" name="JL" readonly="readonly" id="JL" value="" />
                </td>
                <td style="border: none;text-align:right"><input type="text" style="width:100%;text-align:right;height:24px;line-height:24px;" id="Date1" name="Date1" value="" /></td>
                <td style="border: none;text-align:right"><input type="text" style="width:100%;text-align:right;height:24px;line-height:24px;" id="Date2" name="Date2" value="" /></td>
                <td style="border: none;text-align:right"><input type="text" style="width:100%;text-align:right;height:24px;line-height:24px;" id="Date3" name="Date3" value="" /></td>
            </tr>

        </table>
    </form>

</body>




控制器方法的生成






用于记录页面和表对应关系的excel生成

主要用到了 excel的分列 和 字符串的拼接

分列

字符串的拼接
=CONCATENATE("/NewTable/SurveyReport/",C1)








python代码

这些库是之前就安装好的
后面就安装了一个 xpinyin 这个


import os
import openpyxl
from openpyxl import Workbook
from copy import deepcopy
from openpyxl.utils import get_column_letter
from xpinyin import Pinyin
import re



workbook2 = Workbook()
strHtml_tmp = ''
strTxt_filename = ''
strTxt_methodname = ''



def in_area(arr_area, row, col):# 判断是否在区域中
    for area in arr_area:
        if row >= area['r1'] and row <= area['r2'] and col >= area['c1'] and col <= area['c2']:
            #print('in_area')
            #print(area)            
            return area
    #print('not_in_area')
    return None


def is_first(area, row, col):# 判断是否是区域的第一个单元格
    min_row = area['r1']  if area['r1'] < area['r2'] else area['r1']
    min_col = area['c1']  if area['c1'] < area['c2'] else area['c2']    
    if row==min_row and col==min_col:
        #print('is_first')
        #print('min_row,min_col', min_row, min_col)    
        return True
    #print('not_first')
    return False


p = Pinyin()
def get_en_name(cn):# 获取英文名    
    en = p.get_initials(cn, '')
    return en









def read_worksheet(path):

    #path='test1.xlsx'
    workbook = openpyxl.load_workbook(path)# 加载excel
    name_list = workbook.sheetnames# 所有sheet的名字
    worksheet = workbook[name_list[0]]# 读取第一个工作表

    # 获取所有 合并单元格的 位置信息
    # 是个可迭代对象,单个对象类型:openpyxl.worksheet.cell_range.CellRange
    # print后就是excel坐标信息
    m_list = worksheet.merged_cells

    l = deepcopy(m_list)# 深拷贝
    arr_area = []
    
    # 拆分合并的单元格 并填充内容
    for m_area in l:
        # 这里的行和列的起始值(索引),和Excel的一样,从1开始,并不是从0开始(注意)
        r1, r2, c1, c2 = m_area.min_row, m_area.max_row, m_area.min_col, m_area.max_col
        worksheet.unmerge_cells(start_row=r1, end_row=r2, start_column=c1, end_column=c2)
        # print('区域:', m_area, '  坐标:', r1, r2, c1, c2)
        arr_area.append({'r1':r1,'r2':r2,'c1':c1,'c2':c2,'m_area':str(m_area)})        
    
    strHtml = ''    
         
    # global strTxt_filename
    # print('max_row,max_column', worksheet.max_row, worksheet.max_column)
    # strTxt_filename += path + '\n'
    # strTxt_filename += str(worksheet.max_row) + ',' + str(worksheet.max_column)
    # strTxt_filename += '\r\n'
    # return
    
    # 遍历行何列 坐标从0开始
    for x in range(worksheet.max_row):
        strHtml += '<tr>'
        strHtml += '\n'
        for y in range(worksheet.max_column):
            row = x + 1
            col = y + 1
            cellValue = worksheet.cell(row, col).value
            # print('单元格内容:', cellValue)
            area = in_area(arr_area, row, col)
            if not area is None:# 是合并单元格
                if is_first(area, row, col):
                    rowspan = area['r2']-area['r1']+1
                    colspan = area['c2']-area['c1']+1
                    # print('合并单元格:', area)
                    # print('rowspan,colspan:', rowspan, colspan)
                    # print('row,col,cellValue:', row, col, cellValue)                        
                    str_cellValue = cellValue if not cellValue is None and not str(cellValue).isspace() else '<input style="text-align:center" value="" />'                        
                    strHtml += '<td rowspan="{}" colspan="{}" style="white-space: pre-wrap;">{}</td>'.format(rowspan, colspan, str_cellValue)
                    strHtml += '\n'
                    # print(strHtml)
                else:
                    # print('在合并单元格中,跳过')
                    a = 0
            else:# 非合并单元格
                # print('单独的单元格')
                str_cellValue = cellValue if not cellValue is None and not str(cellValue).isspace() else '<input style="text-align:center" value="" />'                        
                strHtml += '<td colspan="1" style="white-space: pre-wrap;">{}</td>'.format(str_cellValue)
                strHtml += '\n'
        strHtml += '</tr>'
        strHtml += '\n'
    
    return strHtml
    # # 保存文件
    # file=open(html_save_path, 'w', encoding = "utf-8") 
    # file.write(strHtml_tmp.replace('tmp_title',tmp_title).replace('tmp_content', strHtml)); 
    # file.close()     



def show_files(path, all_files):
    file_list = os.listdir(path)
    for file in file_list:
        cur_path = os.path.join(path, file)
        basename = os.path.basename(path)
        if os.path.isdir(cur_path):# delete dir
            show_files(cur_path, all_files)
        else:# file            
            parent_dir = path.replace('C-1','')
            last_index = parent_dir.rfind('\\')
            parent_dir = parent_dir[last_index+1: ]                        
            print('cur_path', cur_path)
            # print('path', path)
            # print('file', file)
            # 得到新文件名
            filename_cn_center = re.findall(re.compile(r'[(](.*?)[)]', re.S), file)[0].replace('、','')
            filename_en = get_en_name(parent_dir)+'_'+get_en_name(filename_cn_center) 
            # print('filename_en', filename_en)
            
            global strTxt_filename
            global strTxt_methodname
            strTxt_filename += str(file)+','+filename_en+'\n'
            strTxt_methodname += '// '+str(file)+'\n public ActionResult '+filename_en+'(){return View();}\r\n'
            
            html_save_path = 'e:\\html\\{}.cshtml'.format(filename_en)
            tmp_title = filename_cn_center+'现场质量检验报告单'
            tmp_content = read_worksheet(cur_path)
            
            # 保存文件
            file=open(html_save_path, 'w', encoding = "utf-8") 
            html=str(strHtml_tmp)
            html=html.replace('tmp_title',tmp_title)
            html=html.replace('tmp_content',tmp_content)
            file.write(html);
            file.close()   
            
    return all_files




# 读取模板
f = open("e:\\tmp_html.cshtml", encoding = "utf-8")
strHtml_tmp = f.read()
f.close()

contents = show_files("E:\\111", [])


# 保存文件
file=open("e:\\strTxt_filename.txt", 'w', encoding = "utf-8") 
file.write(strTxt_filename); 
file.close() 

file=open("e:\\strTxt_methodname.txt", 'w', encoding = "utf-8") 
file.write(strTxt_methodname); 
file.close() 






# html='tmp_title 间隔 tmp_content'
# html=html.replace('tmp_title', '开始')
# html=html.replace('tmp_content', '结束')
# print(html)



# 代码的核心逻辑---------------------

# 遍历行
# 遍历列
# 单元格 在不在区域中
# 是否是 区域的开头
# 如果是开头,则用区域。如果不是开头,则跳过

# 核心判断
# 判断一个单元格(行和列) 是否在区域中
# 判断一个单元格 是否是区域的开头



# 13个列
# 15个列


# strHtml_tmp = ''
# f = open("test1.cshtml",encoding = "utf-8")
# strHtml_tmp = f.read()
# f.close()


# print('arr_area------------')
# print(len(arr_area))
# print(arr_area)

# area = in_area(arr_area, 5, 1)
# if not area is None:
    # is_first(area, 5, 1)        


#def each_files():    
    #pathDir =  os.listdir('./files/')
    #for index, value in enumerate(pathDir):
        #filepath2 = './files/' + value
        #print(filepath2)
        #create_worksheet(filepath2)        

#each_files()
#workbook2.save('test2.xlsx')


# https://pypi.org/project/xpinyin/
# pip install -U xpinyin


# create_worksheet('E://1-1.xlsx')


# p = Pinyin()
# res = p.get_initials("照明设施", '')
# print(res)


#string = 'abe(ac)ad)'
#string = 'C-1现场质量检验报告单(照明设施).xls'
#p1 = re.compile(r'[(](.*?)[)]', re.S) # 最小匹配