
I created a reporting tool as part of an internal web application. The report displays all results in a GridView, and I used JavaScript to read the contents of the GridView row-by-row into an Excel object. The JavaScript goes on to create a PivotTable on a different worksheet.

Unfortunately I didn't anticipate that the size of the GridView would overwhelm the browser when more than a few days' worth of results are returned. The application generates a few thousand records per day, say 60k per month, and ideally I'd like to be able to return all results for up to a year. The number of rows is causing the browser to hang or crash.

We're using ASP.NET 3.5 in Visual Studio 2010 with SQL Server, and the expected browser is IE8. The report consists of a GridView that gets its data from one of a handful of stored procedures, depending on which population the user chooses. The GridView sits in an UpdatePanel:

<asp:UpdatePanel ID="update_ResultSet" runat="server">
<Triggers>
    <asp:AsyncPostBackTrigger ControlID="btn_Submit" />
</Triggers>
<ContentTemplate>
<asp:Panel ID="pnl_ResultSet" runat="server" Visible="False">
    <div runat="server" id="div_ResultSummary">
        <p>This Summary Section is Automatically Completed from Code-Behind</p>
    </div>
        <asp:GridView ID="gv_Results" runat="server" 
            HeaderStyle-BackColor="LightSkyBlue" 
            AlternatingRowStyle-BackColor="LightCyan"  
            Width="100%">
        </asp:GridView>
</asp:Panel>
</ContentTemplate>
</asp:UpdatePanel>

I was relatively new to my team at the time, so I followed their typical practice of loading the stored procedure's results into a DataTable and using that as the DataSource in the code-behind:

    List<USP_Report_AreaResult> areaResults =
        db.USP_Report_Area(ddl_Line.Text, ddl_Unit.Text, ddl_Status.Text, ddl_Type.Text, ddl_Subject.Text, minDate, maxDate).ToList();
    dtResults = Common.LINQToDataTable(areaResults);

    if (dtResults.Rows.Count > 0)
    {
        PopulateSummary(ref dtResults);
        gv_Results.DataSource = dtResults;
        gv_Results.DataBind();
    }

(I know what you're thinking! But yes, I have learned much more about parameterization since then.)

The LINQToDataTable function isn't anything special; it just converts a generic list to a DataTable.
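For context, a helper like that usually looks something like the following; this is just a rough sketch of the idea (reflection over T's public properties), not our actual Common.LINQToDataTable code:

    // Illustrative only -- a typical List<T>-to-DataTable helper using reflection.
    // Requires System.Data and System.Reflection.
    public static DataTable LINQToDataTable<T>(IEnumerable<T> items)
    {
        DataTable table = new DataTable(typeof(T).Name);
        PropertyInfo[] props = typeof(T).GetProperties();

        // One column per public property (unwrap Nullable<T> so the column type is concrete).
        foreach (PropertyInfo prop in props)
            table.Columns.Add(prop.Name, Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType);

        // One row per item.
        foreach (T item in items)
        {
            DataRow row = table.NewRow();
            foreach (PropertyInfo prop in props)
                row[prop.Name] = prop.GetValue(item, null) ?? DBNull.Value;
            table.Rows.Add(row);
        }

        return table;
    }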

With a few thousand records (up to a few days' worth), this works fine. The GridView displays the results, and there's a button the user clicks to launch the JScript exporter. The external JavaScript function reads each row into an Excel sheet, and then uses that sheet to create a PivotTable. The PivotTable is important!

function exportToExcel(sMyGridViewName, sTitleOfReport, sHiddenCols) {
//sMyGridViewName = the name of the grid view, supplied as a text
//sTitleOfReport = Will be used as the page header if the spreadsheet is printed
//sHiddenCols = The columns you want hidden when sent to Excel, separated by semicolon (i.e. 1;3;5).
//              Supply an empty string if all columns are visible.

var oMyGridView = document.getElementById(sMyGridViewName);

//If no data is on the GridView, display alert.
if (oMyGridView == null)
    alert('No data for report');
else {
    var oHid = sHiddenCols.split(";");  //Contains an array of columns to hide, based on the sHiddenCols function parameter
    var oExcel = new ActiveXObject("Excel.Application");
    var oBook = oExcel.Workbooks.Add;
    var oSheet = oBook.Worksheets(1);
    var iRow = 0;
    for (var y = 0; y < oMyGridView.rows.length; y++)
    //Export all non-hidden rows of the HTML table to excel.
    {
        if (oMyGridView.rows[y].style.display == '') {
            var iCol = 0;
            for (var x = 0; x < oMyGridView.rows(y).cells.length; x++) {
                var bHid = false;
                for (iHidCol = 0; iHidCol < oHid.length; iHidCol++) {
                    if (oHid[iHidCol].length !=0 && oHid[iHidCol] == x) {
                        bHid = true;
                        break; 
                    } 
                }
                if (!bHid) {
                    oSheet.Cells(iRow + 1, iCol + 1) = oMyGridView.rows(y).cells(x).innerText;
                    iCol++;
                }
            }
            iRow++;
        }
    }

    //...the function goes on to build the PivotTable on a second worksheet (omitted here)...

} //End else
} //End function

What I'm trying to do: Create a solution (probably client-side) that can handle this amount of data and get it into Excel. Someone might suggest using the HtmlTextWriter trick, but as far as I know that doesn't allow for automatically generating a PivotTable, and it triggers an obnoxious pop-up warning when the file is opened (that's the render-the-GridView-to-the-response approach sketched below).
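Roughly, that HtmlTextWriter approach looks like this; I'm including it only for reference, since it's what I'm trying to avoid:

    // Rough sketch of the HtmlTextWriter export trick (not what I want):
    // render the GridView's HTML and send it with an Excel content type.
    // Note: the page must override VerifyRenderingInServerForm for RenderControl to work.
    Response.Clear();
    Response.ContentType = "application/vnd.ms-excel";
    Response.AddHeader("Content-Disposition", "attachment; filename=Report.xls");

    using (StringWriter sw = new StringWriter())
    using (HtmlTextWriter hw = new HtmlTextWriter(sw))
    {
        gv_Results.RenderControl(hw);
        Response.Write(sw.ToString());
    }
    Response.End();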

What I've tried:

  • Populating a JSON object -- I still think this has potential but I haven't found a way of making it work.
  • Using a SQLDataSource -- I can't seem to use it to get any data back out.
  • Paginating and looping through the pages -- Mixed progress. Generally ugly though, and I still have the problem that the entire dataset is queried and returned for each page displayed.

Update: I'm still very open to alternate solutions, but I've been pursuing the JSON theory. I have a working server-side method that generates the JSON object from a DataTable. I can't figure out how to pass that JSON into the (external) exportToExcel JavaScript function....

    protected static string ConstructReportJSON(ref DataTable dtResults)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("var sJSON = [");
        for (int r = 0; r < dtResults.Rows.Count; r++)
        {
            sb.Append("{");
            for (int c = 0; c < dtResults.Columns.Count; c++)
            {
                sb.AppendFormat("\"{0}\":\"{1}\",", dtResults.Columns[c].ColumnName, dtResults.Rows[r][c].ToString());
            }
            sb.Remove(sb.Length - 1, 1); //Truncate the trailing comma
            sb.Append("},");
        }
        sb.Remove(sb.Length - 1, 1);
        sb.Append("];");
        return sb.ToString();
    }

Can anybody show an example of how to carry this JSON object into an external JS function? Or any other solution for the export to Excel.

  • "Populating a JSON object" ...My first thought as I was reading the question :)
    – Chandu
    Commented Jul 6, 2012 at 23:00
  • Maybe you can use an Excel object directly? msdn.microsoft.com/en-us/library/wss56bz7(v=vs.80).aspx
    – Aerik
    Commented Jul 6, 2012 at 23:01
  • Are you really creating an Excel object via client-side js? Seems like you'd need incredibly low browser security settings for this. Wouldn't it be easier to create a process by which the Excel file gets populated on the server and then downloaded to the client? Commented Jul 6, 2012 at 23:21
  • @Chandu - It's still my thought too! I feel like I've gotten close -- I have a working method to generate the JSON object -- but I still can't figure out how to plug the JSON into the above JS. :-( Aerik - Maybe. If this were a fat app, that's what I'd do. Is it possible to make Excel interop work with a web app? John - Oops, my bad.
    – RJB
    Commented Jul 10, 2012 at 16:02
  • TimWilliams - The browsers are set to Medium-High. Since it's an IE-only environment, client-side is possible with the ActiveX JS.
    – RJB
    Commented Jul 10, 2012 at 16:07

4 Answers


It's easy and efficient to write CSV files. However, if you need a real Excel file, that can also be done reasonably efficiently, in a way that handles 60,000+ rows, by using the Open XML SDK's OpenXmlWriter.

  1. Install the Microsoft Open XML SDK if you don't have it already (google "download microsoft open xml sdk")
  2. Create a Console App
  3. Add Reference to DocumentFormat.OpenXml
  4. Add Reference to WindowsBase
  5. Try running some test code like the example below (you'll need a few using directives)

Check out Vincent Tan's solution at http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/ (below, I've cleaned up his example slightly to help new users).

In my own use I found this pretty straightforward with regular data, but I did have to strip out "\0" characters from my real data.

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

...

        using (var workbook = SpreadsheetDocument.Create("SomeLargeFile.xlsx", SpreadsheetDocumentType.Workbook))
        {
            List<OpenXmlAttribute> attributeList;
            OpenXmlWriter writer;

            workbook.AddWorkbookPart();
            WorksheetPart workSheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();

            writer = OpenXmlWriter.Create(workSheetPart);
            writer.WriteStartElement(new Worksheet());
            writer.WriteStartElement(new SheetData());

            for (int i = 1; i <= 50000; ++i)
            {
                attributeList = new List<OpenXmlAttribute>();
                // this is the row index
                attributeList.Add(new OpenXmlAttribute("r", null, i.ToString()));

                writer.WriteStartElement(new Row(), attributeList);

                for (int j = 1; j <= 100; ++j)
                {
                    attributeList = new List<OpenXmlAttribute>();
                    // this is the data type ("t"), with CellValues.String ("str")
                    attributeList.Add(new OpenXmlAttribute("t", null, "str"));

                    // it's suggested you also have the cell reference, but
                    // you'll have to calculate the correct cell reference yourself.
                    // Here's an example:
                    //attributeList.Add(new OpenXmlAttribute("r", null, "A1"));

                    writer.WriteStartElement(new Cell(), attributeList);

                    writer.WriteElement(new CellValue(string.Format("R{0}C{1}", i, j)));

                    // this is for Cell
                    writer.WriteEndElement();
                }

                // this is for Row
                writer.WriteEndElement();
            }

            // this is for SheetData
            writer.WriteEndElement();
            // this is for Worksheet
            writer.WriteEndElement();
            writer.Close();

            writer = OpenXmlWriter.Create(workbook.WorkbookPart);
            writer.WriteStartElement(new Workbook());
            writer.WriteStartElement(new Sheets());

            // you can use object initialisers like this only when the properties
            // are actual properties. SDK classes sometimes have property-like properties
            // but are actually classes. For example, the Cell class has the CellValue
            // "property" but is actually a child class internally.
            // If the properties correspond to actual XML attributes, then you're fine.
            writer.WriteElement(new Sheet()
            {
                Name = "Sheet1",
                SheetId = 1,
                Id = workbook.WorkbookPart.GetIdOfPart(workSheetPart)
            });

            writer.WriteEndElement(); // Write end for WorkSheet Element
            writer.WriteEndElement(); // Write end for WorkBook Element
            writer.Close();

            workbook.Close();
        }

If you review that code, you'll notice two major writes: first the worksheet, and then later the workbook that contains the sheet. The workbook part is the boring part at the end; the earlier worksheet part contains all the rows and columns.

In your own adaptation, you would write real string values into the cells from your own data (above, we're just writing the row and column numbering), for example:

writer.WriteElement(new CellValue("SomeValue"));

Worth noting: row numbering in Excel starts at 1, not 0. Numbering rows from an index of zero will lead to "Corrupt file" error messages.

Lastly, if you're working with very large sets of data, never call ToList(). Use a data-reader style of streaming the data: for example, you could have an IQueryable and enumerate it in a foreach, as in the sketch below. You never want to rely on having all of the data in memory at the same time, or you'll hit high memory utilization and possibly an out-of-memory failure.
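Here's a minimal sketch of that streaming idea, reusing the OpenXmlWriter from the example above; the ReportRow type (with Caller/Office/Type fields, as in the question's JSON sample) is a hypothetical stand-in for your own entity:

    // Hypothetical streaming helper: rows are enumerated one at a time from an
    // IEnumerable/IQueryable and written straight to the OpenXmlWriter, so the
    // full result set is never materialized in memory (no ToList()).
    static void WriteRows(OpenXmlWriter writer, IEnumerable<ReportRow> rows)
    {
        int rowIndex = 2;                    // row 1 is assumed to hold the headers
        foreach (ReportRow row in rows)      // streams one row at a time
        {
            writer.WriteStartElement(new Row(),
                new List<OpenXmlAttribute> { new OpenXmlAttribute("r", null, rowIndex.ToString()) });

            foreach (string value in new[] { row.Caller, row.Office, row.Type })
            {
                writer.WriteStartElement(new Cell(),
                    new List<OpenXmlAttribute> { new OpenXmlAttribute("t", null, "str") });
                writer.WriteElement(new CellValue(value));
                writer.WriteEndElement();    // Cell
            }

            writer.WriteEndElement();        // Row
            rowIndex++;
        }
    }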

For large files to be streamed you may want to try supplying the output stream instead of a file name; however, according to this answer, OpenXml and HttpResponse.OutputStream, the OpenXML library needs to be able to seek back to the beginning of the stream, so saving to a file and then serving that file seems like the only choice for large amounts of data.
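A minimal sketch of that save-then-serve approach (the BuildWorkbook call is a hypothetical wrapper around the OpenXmlWriter code above, and the file names are made up):

    // Write the workbook to a temp file first (OpenXML needs a seekable stream),
    // then stream that file down to the browser as a download.
    string tempPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".xlsx");
    BuildWorkbook(tempPath);   // hypothetical method containing the OpenXmlWriter code above

    Response.Clear();
    Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    Response.AddHeader("Content-Disposition", "attachment; filename=Report.xlsx");
    Response.TransmitFile(tempPath);   // streams from disk without buffering the whole file in memory
    Response.End();
    // (You'd also want to delete the temp file afterwards, e.g. from a cleanup job.)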

  • This was really good to get me started. Lots of gotchas, but once you understand the schemas it makes a lot more sense. Spend some time looking at the underlying XML in a valid .xlsx file.
    – KornMuffin
    Commented Apr 25, 2022 at 4:11

I would try using displaytag to display the results. You could set it up to display a certain number of rows per page, which should solve your overloading issue. Then you can set displaytag to allow for an Excel export.

  • Not a bad idea, but I'd like to avoid using third-party addons if at all possible.
    – RJB
    Commented Jul 10, 2012 at 15:44
  • @RJB Why avoid third-party libraries? Especially a light-weight library such as displaytag. They can be your best friend. Why re-invent the wheel? Commented Jul 11, 2012 at 15:17

We typically handle this with an "Export" command button that's wired up to a server-side method which grabs the dataset and converts it to CSV. Then we adjust the response headers and the browser treats it as a download. I know this is a server-side solution, but you may want to consider it, since you'll keep having timeout and browser issues until you implement server-side record paging.
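For reference, a rough sketch of what that kind of export handler can look like (the button name and GetReportData method are placeholders, and the CSV quoting here is deliberately minimal):

    // Hypothetical "Export" button handler: convert the DataTable to CSV and
    // return it as a download by adjusting the response headers.
    protected void btn_Export_Click(object sender, EventArgs e)
    {
        DataTable dt = GetReportData();   // placeholder: re-run the stored procedure server-side

        StringBuilder sb = new StringBuilder();
        sb.AppendLine(string.Join(",", dt.Columns.Cast<DataColumn>().Select(c => c.ColumnName).ToArray()));
        foreach (DataRow row in dt.Rows)
        {
            // Quote each field and double any embedded quotes (basic CSV escaping).
            sb.AppendLine(string.Join(",", row.ItemArray.Select(v => "\"" + v.ToString().Replace("\"", "\"\"") + "\"").ToArray()));
        }

        Response.Clear();
        Response.ContentType = "text/csv";
        Response.AddHeader("Content-Disposition", "attachment; filename=Report.csv");
        Response.Write(sb.ToString());
        Response.End();
    }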

  • I think that this would be similar to the HtmlTextWriter method, with some similar drawbacks. But I'm not opposed to a server-side solution, so long as it's still capable of programmatically manipulating Excel to generate a PivotTable.
    – RJB
    Commented Jul 10, 2012 at 15:50
  • I believe there is an API to do advanced Excel workbook creation, such as programmatically creating the pivot. In our situation, we typically pivot and manipulate the data in C# before exporting it as CSV.
    – mellodev
    Commented Jul 10, 2012 at 20:01

Almost a week and a half after I began working on this problem, I've finally managed to get it all working to some extent. I'll hold off on marking an answer for now to see if anybody else has a more efficient, better 'best practices' method.

By generating a JSON string, I've divorced the JavaScript from the GridView. The JSON is generated in code behind when the data is populated:

    protected static string ConstructReportJSON(ref DataTable dtResults)
    {
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < dtResults.Rows.Count; r++)
        {
            sb.Append("{");
            for (int c = 0; c < dtResults.Columns.Count; c++)
            {
                sb.AppendFormat("\"{0}\":\"{1}\",", dtResults.Columns[c].ColumnName, dtResults.Rows[r][c].ToString());
            }
            sb.Remove(sb.Length - 1, 1); //Truncate the trailing comma
            sb.Append("},");
        }
        sb.Remove(sb.Length - 1, 1);
        return String.Format("[{0}]", sb.ToString());
    }

Returns a string of data such as

[ {"Caller":"John Doe", "Office":"5555","Type":"Incoming", etc},

{"Caller":"Jane Doe", "Office":"7777", "Type":"Outgoing", etc}, {etc} ]

I've hidden this string by assigning the text to a Literal in the UpdatePanel using:

    <div id="div_JSON" style="display: none;">
            <asp:Literal id="lit_JSON" runat="server" /> 
    </div>
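In the code-behind, that just means assigning the generated string to the Literal after the data is bound, along these lines:

    // In the submit handler, after binding the GridView:
    lit_JSON.Text = ConstructReportJSON(ref dtResults);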

And the JavaScript parses that output by reading the contents of the div:

function exportToExcel_Pivot(sMyJSON, sTitleOfReport, sReportPop) {
     //sMyJSON = the name, supplied as a text, of the hidden element that houses the JSON array.
     //sTitleOfReport = Will be used as the page header if the spreadsheet is printed.
     //sReportPop = Determines which business logic to create a pivot table for.

var sJSON = document.getElementById(sMyJSON).innerHTML;
var oJSON = eval("(" + sJSON + ")");

 //    DEBUG Example Test Code
 //    for (x = 0; x < oJSON.length; x++) {
 //        for (y in oJSON[x])
 //            alert(oJSON[x][y]); //DEBUG, returns field value
 //            alert(y); //DEBUG, returns column name
 //    }


//If no data is in the JSON object array, display alert.
if (oJSON == null)
    alert('No data for report');
else {
    var oExcel = new ActiveXObject("Excel.Application");
    var oBook = oExcel.Workbooks.Add;
    var oSheet = oBook.Worksheets(1);
    var oSheet2 = oBook.Worksheets(2);
    var iRow = 0;
    var iCol = 0;

        //Take the column names of the JSON object and prepare them in Excel
        for (header in oJSON[0])
        {
            oSheet.Cells(iRow + 1, iCol + 1) = header;
            iCol++;
        }

        iRow++;

        //Export all rows of the JSON object to excel
        for (var r = 0; r < oJSON.length; r++)
        {
            iCol = 0;
            for (c in oJSON[r]) 
                    {
                        oSheet.Cells(iRow + 1, iCol + 1) = oJSON[r][c];
                        iCol++;
                    } //End column loop
            iRow++;
        } //End row

    // ...the code then goes on to build the PivotTable on oSheet2 (omitted here)...

} //End else
} //End function

The string output and the JavaScript 'eval' parsing both work surprisingly fast, but looping through the JSON object is a little slower than I'd like.

I believe this method would be limited to around 1 billion characters of data -- maybe less, depending on how memory testing works out. (I've calculated that I'll probably be looking at a maximum of 1 million characters per day, so that should be fine within one year of reporting.)
