ICode9

精准搜索请尝试: 精确搜索
首页 > 其他分享> 文章详细

poi生成excel大数据量的合并单元格操作优化

2021-08-03 12:34:09  阅读:237  来源: 互联网

标签:sheet 单元格 excel addMergeRegion getKey CellRangeAddress 数据量 entry new


poi生成excel大数据量的合并单元格操作优化

之前根据官网给的api和example,对excel单元格的合并操作使用下方的代码(poi版本 3.12)

public class MergedCells {
   public static void main(String[] args) throws IOException {
        HSSFWorkbook wb = new HSSFWorkbook();
        HSSFSheet sheet = wb.createSheet("new sheet");

        HSSFRow row = sheet.createRow(1);
        HSSFCell cell = row.createCell(1);
        cell.setCellValue("This is a test of merging");

        sheet.addMergedRegion(new CellRangeAddress(1, 1, 1, 2));

        // Write the output to a file
        FileOutputStream fileOut = new FileOutputStream("workbook.xls");
        wb.write(fileOut);
        fileOut.close();
        wb.close();
    }
}

当使用支持07以上的XSSFWorkbook和SXSSFWorkbook的时候同样可以这样,代码如下:

private static void mergeWithXSSF() throws  IOException{
       XSSFWorkbook wb = new XSSFWorkbook();
       XSSFSheet sheet = wb.createSheet("new sheet");

       XSSFRow row = sheet.createRow(1);
       XSSFCell cell = row.createCell(1);
       cell.setCellValue("This is a test of merging");
       sheet.addMergedRegion(new CellRangeAddress(1, 1, 1, 2));
        
       // Write the output to a file
       FileOutputStream fileOut = new FileOutputStream("d:/temp/workbook1.xlsx");
       wb.write(fileOut);
       fileOut.close();
       wb.close();
 }

但是当mergeCell的次数非常大(数万~数十万)时候,对cpu和内存消耗不仅大大增加,而且耗时也非常大,本地测试的时候,循环次数6W次,大约需要20-30分钟;耗时核心代码如下:

for(int i=0;i<100000;i++) {
      XSSFRow row = sheet.createRow(i);
      XSSFCell cell = row.createCell(1);
      cell.setCellValue("This is a test of merging");
      sheet.addMergedRegion(new CellRangeAddress(i, i, 1, 2));
  }

官网版本是3.16以后,我更新后发现 sheet 多了一个合并单元格的方法 addMergedRegionUnsafe,比addMergedRegion少了一些检测异常的过程

源码如下:

public int addMergedRegion(CellRangeAddress region) {
        return this.addMergedRegion(region, true);
    }

    public int addMergedRegionUnsafe(CellRangeAddress region) {
        return this.addMergedRegion(region, false);
    }

    private int addMergedRegion(CellRangeAddress region, boolean validate) {
        if(region.getNumberOfCells() < 2) {
            throw new IllegalArgumentException("Merged region " + region.formatAsString() + " must contain 2 or more cells");
        } else {
            region.validate(SpreadsheetVersion.EXCEL2007);
            if(validate) {
                this.validateArrayFormulas(region);
                this.validateMergedRegions(region);
            }

            CTMergeCells ctMergeCells = this.worksheet.isSetMergeCells()?this.worksheet.getMergeCells():this.worksheet.addNewMergeCells();
            CTMergeCell ctMergeCell = ctMergeCells.addNewMergeCell();
            ctMergeCell.setRef(region.formatAsString());
            return ctMergeCells.sizeOfMergeCellArray();
        }
    }

使用这个方法之后,对于十万条数据合并单元格的本地测试就降低到了30多秒,感觉真的是质的飞跃,非常高兴,但是这只是开始,我想到既然通过减少了一些异常检测就有如此神威,是否合并单元格的方法还可以继续缩减呢?

合并单元格的核心代码在这:

//判断sheet是否已经有过合并单元格的经历,如果有就getMergeCells得到ctMergeCells 对象,否则就从addNewMergeCells获取对象(因为isSetMergeCells中使用了锁,还有一些复杂的操作,感觉会比较耗时)
CTMergeCells ctMergeCells = this.worksheet.isSetMergeCells()?this.worksheet.getMergeCells():this.worksheet.addNewMergeCells();
CTMergeCell ctMergeCell = ctMergeCells.addNewMergeCell();
ctMergeCell.setRef(region.formatAsString());
return ctMergeCells.sizeOfMergeCellArray();

自己控制是否合并了单元格。设置一个本地变量 int mergeCellsCount = 0;如果合并了单元格 mergeCellsCount ++; 刚才核心的代码可以改成

CTMergeCells ctMergeCells = mergeCellsCount >0 ?this.worksheet.getMergeCells():this.worksheet.addNewMergeCells();
CTMergeCell ctMergeCell = ctMergeCells.addNewMergeCell();
ctMergeCell.setRef(region.formatAsString());
mergeCellsCount  ++
return ctMergeCells.sizeOfMergeCellArray();

return每次合并单元格的时候还要返回 已经 合并的单元格 的数目(我们上方定义的mergeCellsCount )

我们只要合并的过程,这个计数对我们没用,所以,自己重写了以下addMergedRegion方法

private void addMergeRegion(CellRangeAddress cra) {
		XSSFWorkbook workbook = new XSSFWorkbook();
		XSSFSheet sheet = workbook.getSheetAt(0);
		CTWorksheet ctWorksheet = sheet.getCTWorksheet();

		CTMergeCells ctMergeCells = mergeCellsCount > 0 ?ctWorksheet.getMergeCells():ctWorksheet.addNewMergeCells();
		CTMergeCell ctMergeCell = ctMergeCells.addNewMergeCell();

		ctMergeCell.setRef(cra.formatAsString());
		mergeCellsCount ++;
	}

总结一下:主要是CellRangeAddress 是自己定义的,自己会控制合并区域的单元格的合法性,所以去掉验证合法性的代码; 去掉返回count的代码

以上转载自:https://my.oschina.net/u/1417838/blog/edit

--------------------------------------------------------------------------------分割线--------------------------------------------------------------------------------------------

然后我自己的业务场景是有大量的合并操作,我把要合并的参数保存到map中,然后再开线程跑(我是用SXSSFWorkbook 操作的,该方法操作大数据可观)

//maps用来存要合并的参数
System.out.println("总数"+maps.size());
        if (maps.size() > 0){
			ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
			// 设置核心线程数
			executor.setCorePoolSize(8);
			// 设置最大线程数
			executor.setMaxPoolSize(8);
			// 设置队列容量
			executor.setQueueCapacity(350);
			// 设置线程活跃时间(秒)
			executor.setKeepAliveSeconds(60);
			// 等待所有任务结束后再关闭线程池
			executor.setWaitForTasksToCompleteOnShutdown(true);
			// rejection-policy:当pool已经达到max size的时候,如何处理新任务
			// CALLER_RUNS:不在新线程中执行任务,而是有调用者所在的线程来执行
			executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
			executor.initialize();
			int t = Runtime.getRuntime().availableProcessors();
			System.out.println("当前可用的线程数:"+t);

			BlockingQueue<List<Map<Integer, Integer>>> queue = new ArrayBlockingQueue(maps.size());
			//以先入先出的顺序排序的阻塞队列
			for (Map.Entry<Integer, Integer> entry : maps.entrySet()) {
				List<Map<Integer, Integer>> listMap = new ArrayList<>();
				Map<Integer, Integer> mapList = new HashMap<>();
				mapList.put(entry.getKey(), entry.getValue());
				listMap.add(mapList);
				queue.add(listMap);
			}
			CountDownLatch latch = new CountDownLatch(queue.size());
			int rowNum = 0;
			while (queue.size()>0){
				try {
					System.out.println(rowNum);
					rowNum++;
					executor.execute(new MergedRegionForThread(latch,queue.take(),sheet));
				} catch (InterruptedException e) {
					e.printStackTrace();
				}
			}
			try {
				latch.await();
			} catch (InterruptedException e) {
				e.printStackTrace();
			}
		}
/**
 * 多线程合并表格
 */
public class MergedRegionForThread implements Runnable{

    private CountDownLatch latch;
    private List<Map<Integer, Integer>> dataList;
    private int mergeCellsCount = 0;
    private SXSSFSheet sheet;

    public MergedRegionForThread(CountDownLatch latch, List<Map<Integer, Integer>> dataList, SXSSFSheet sheet) {

        this.latch = latch;
       this.dataList = dataList;
        this.sheet = sheet;
    }

    @Override
    public void run() {
        try {
            Map<Integer, Integer> map = (Map<Integer, Integer>) dataList.get(0);
            for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 0, 0));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 1, 1));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 2, 2));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 3, 3));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 4, 4));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 5, 5));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 6, 6));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 7, 7));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 8, 8));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 9, 9));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 10, 10));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 11, 11));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 12, 12));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 13, 13));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 14, 14));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 15, 15));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 16, 16));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 17, 17));
                addMergeRegion(sheet,new CellRangeAddress(entry.getKey(), entry.getValue(), 39, 39));
            }
        }catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (null != latch) {
                latch.countDown();
            }
        }
    }

    /**
     * 根据源码自定义表格合并方法(去掉验证和判断),只提取合并方法
     * @param sheet
     * @param cra
     */
    private void addMergeRegion(SXSSFSheet sheet,CellRangeAddress cra) {
        SXSSFWorkbook workbook = sheet.getWorkbook();
        XSSFWorkbook xssfWorkbook = workbook.getXSSFWorkbook();
        //获取第一个sheet
        XSSFSheet sheet1 = xssfWorkbook.getSheetAt(0);
        CTWorksheet ctWorksheet = sheet1.getCTWorksheet();

        CTMergeCells ctMergeCells = mergeCellsCount > 0 ?ctWorksheet.getMergeCells():ctWorksheet.addNewMergeCells();
        CTMergeCell ctMergeCell = ctMergeCells.addNewMergeCell();

        ctMergeCell.setRef(cra.formatAsString());
        //用来判断sheet是否已经有合并表格
        mergeCellsCount ++;
    }
}

亲测,本来合并要2多个小时以上,使用后20s,30s即刻完成。有错欢迎大家指出

标签:sheet,单元格,excel,addMergeRegion,getKey,CellRangeAddress,数据量,entry,new
来源: https://blog.csdn.net/weixin_43998503/article/details/119347113

本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享;
2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关;
3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关;
4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除;
5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

专注分享技术,共同学习,共同进步。侵权联系[81616952@qq.com]

Copyright (C)ICode9.com, All Rights Reserved.

ICode9版权所有