Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SXSSFWorkbook generated xlsx file cannot be import by PowerBI #1443

Open
1 task done
msmith-work opened this issue Nov 12, 2024 · 3 comments
Open
1 task done

SXSSFWorkbook generated xlsx file cannot be import by PowerBI #1443

msmith-work opened this issue Nov 12, 2024 · 3 comments

Comments

@msmith-work
Copy link

NPOI Version

Current

File Type

  • XLSX

Reproduce Steps

Using streaming SXSSFWorkbook, generate a simple spreadsheet output

Issue Description

The generated xlsx file contains a stray UTF8 BOM (ZWNBSP) character in xl\worksheets\sheet1.xml between <sheetData><row r="1">.

The file can be opened without issue in Excel, however importing this file into PowerBI results in the error message "'Text' is an invalid XmlNodeType, Line 1, position x"

On investigation, it appears the temp file created for streaming is done so using a UTF8 encoding (SheetDataWriter.cs constructor), this then forces the temp file to contain the UTF8 BOM character, which then appears to be inserted into the final spreadsheet just after the opening <sheetData> tag

@tonyqus
Copy link
Member

tonyqus commented Nov 12, 2024

Can you upload the generated xlsx file? And How do you import this xlsx into PowerBI? Please let me know the reproduction steps.

@tonyqus tonyqus changed the title Streaming writer adding stray UTF8 BOM SXSSFWorkbook generated xlsx file cannot be import by PowerBI Nov 12, 2024
@msmith-work
Copy link
Author

Here is some basic example code and the generated spreadsheet (it affects all spreadsheets)

public Stream BuildExampleReport()
{
    using var workbook = new SXSSFWorkbook();
    var worksheet = workbook.CreateSheet("Sheet 1");
    for (var rowNum = 0; rowNum < 3; rowNum++)
    {
        var row = worksheet.CreateRow(rowNum);
        for (var colNum = 0; colNum < 3; colNum++)
        {
            var cell = row.CreateCell(colNum);
            cell.SetCellValue("Example");
        }
    }
    var memory = new MemoryStream();
    workbook.Write(memory, true);
    memory.Position = 0;
    return memory;
}

example.xlsx

The stray UTF8 BOM character is within the xl\worksheets\sheet1.xml file just after the opening <sheetData> tag.

As for importing into PowerBI, this is a process that our customers do with the generated spreadsheets and is not something we do ourselves; it has now been reported by multiple customers, each customer with their own use cases and processes.

We were previously using the non-streaming XSSF functions to generate the files, and these were generated without issue, it is only since switching to the streaming SXSSF functions that the issue has started.

@scousesheriff
Copy link

Hi.

This problem is not specific to PowerBI, it is creating a malformed document, it's just that Excel must work around it when opening the files. It seems to be a simple case of needing to remove this override to the encoding in the SheetDataWriter:

image

The vast majority of StreamWriters in the solution don't use the override, and rely on the the default encoding, which is UTF8NoBOM:

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants