ICode9

精准搜索请尝试: 精确搜索
首页 > 其他分享> 文章详细

Apache POI 用eventmodel 读取大文件Excel (3) Xlsx格式内容补充

2020-12-08 10:00:25  阅读:322  来源: 互联网

标签:Xlsx formulas POI Excel cell formula shared ECMA 376


本文部分内容来自于ECMA-376
http://www.ecma-international.org/publications/standards/Ecma-376.htm

在上一节中,已经了解了OOXML大部分的构成,然鹅在实际操作中,遇到了t="shared"这样的共享公式,是上一篇中没有出现过的内容,stackOverflow上查到是共享公式shared formula但是更详细的内容就没了。

在百度,谷歌,stackOverflow上搜到的描述,讲解OOXML的信息很少,导致最后不得不追根溯源,去wiki上看了一下,发现OOXML是ECMA-376制定的。

.xlsx格式出现的时间是2007年,且它遵守了这个标准,那这个标准诞生的时间必须在2007年以及之前,经过查询发现ECMA-376 1st 第一版就是2006年发布的,所以把它下载下来,看看内容。

经过查询发现,关于.xlsx的XML的cell里的字段定义是在SpreadsheetML里定义的,以下是ECMA原文内容:

8.4 SpreadsheetML
This subclause introduces the overall form of a SpreadsheetML package, and identifies some of its main element types.
(See Part 3 for a more detailed introduction.) A SpreadsheetML package
has a relationship of type officeDocument, which specifies the
location of the main part in the package. For a SpreadsheetML
document, that part contains the workbook definition.

这里它说详细的内容在第三部分,于是我找到第三部分,发现果然是有的:
在这里插入图片描述

在这里插入图片描述

经过查找,补充一部分需要使用的内容,这些都来自于ECMA-376 1st part3

3.2.9.2.1 Shared Formulas

<row r="7" spans="4:8">
  <c r="H7" s="1">
    <f t="shared" ref="H7:H11" ce="1" si="0">SUM(E7:G7)</f>
    <v>1.0246225028914113</v>
  </c>
</row>
<row r="8" spans="4:8">
  <c r="H8" s="1">
    <f t="shared" ce="1" si="0">SUM(E8:G8)</f>
    <v>0.9063376048733931</v>
  </c>
</row>

Just as strings in cells can be extremely pervasive and redundant in a sheet (and therefore must be optimized), formulas are also extremely pervasive in a sheet, and often can be optimized.

Consider the table in the above example, where column H contains a formula that sums the numbers in columns E through G, for each row.

The only difference between the formulas in H6:H12 is that the reference increases by 1 row from one row to the next.

Therefore, an optimization is created where only the formula in H6 needs to be written out, with some additional information indicating how far to propagate the formula once loaded.

This enables the loading application to load and parse only the first of the shared formulas, and then more quickly apply the necessary transforms to produce the additional related formulas in subsequent cells.

Note that while formulas can be shared, it is desirable to enable easy access to the contents of a cell.

Therefore, it is allowed that all formulas may be written out, but only the primary formula in a shared formula need be loaded and parsed.

3.2.9 Cell

<c r="B3">
  <f>B2+1</f>
  <v>2</v>
</c>

The cell itself is expressed by the c collection.

Each cell indicates it’s location in the grid using A1-style reference notation.

A cell can also indicate a style identifier (attribute s) and a data type (attribute t).

The cell types include string, number, and Boolean.

In order to optimize load/save operations, default data values are not written out.

3.2.9.1 Cell Values

Cells contain values, whether the values were directly typed in (e.g., cell A2 in our example has the value External Link:) or are the result of a calculation (e.g., cell B3 in our example has the formula B2+1).

String values in a cell are not stored in the cell table unless they are the result of a calculation.

Therefore, instead of seeing External Link: as the content of the cell’s v node, instead you see a zero-based index into the shared string table where that string is stored uniquely.

This is done to optimize load/save performance and to reduce duplication of information.

To determine whether the 0 in v is a number or an index to a string, the cell’s data type must be examined.

When the data type indicates string, then it is an index and not a numeric value.

在ECMA part4中,终于找到了关于公式的定义

在这里插入图片描述

还有普通类型的定义
在这里插入图片描述

当前就补充到这里,如果后续有需要再返回来补充

标签:Xlsx,formulas,POI,Excel,cell,formula,shared,ECMA,376
来源: https://blog.csdn.net/weixin_42072754/article/details/110849603

本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享;
2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关;
3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关;
4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除;
5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

专注分享技术,共同学习,共同进步。侵权联系[81616952@qq.com]

Copyright (C)ICode9.com, All Rights Reserved.

ICode9版权所有