Read Excel as JSON using Open XML SDK

C# Excel file As JSON

In our last article, we learned how to do basic Read/Write Excel using .NET Core. Today in this article we will learn how to perform a C# Excel file As JSON using Open XML SDK.

Today in this article, we will cover below aspects,

We shall be using the below Excel file as an example to read.

C# Read Excel(.xlsx) As JSON

Let’s create a .NET Core project, here to keep it simple I am using a Console .NET Core application.

Let’s install the below package Open-XML-SDK,

PM> Install-Package Open-XML-SDK -Version 2.9.0
C# Excel as JSON

C# Excel file As JSON 

Let’s read the content of the Excel (.xlsx) file using the below method.

static string ReadExcelasJSON()
        {
            try
            {
                DataTable dtTable = new DataTable();
                //Lets open the existing excel file and read through its content . Open the excel using openxml sdk
                using (SpreadsheetDocument doc = SpreadsheetDocument.Open("testdata2.xlsx", false))
                {
                    //create the object for workbook part  
                    WorkbookPart workbookPart = doc.WorkbookPart;
                    Sheets thesheetcollection = workbookPart.Workbook.GetFirstChild<Sheets>();

                    //using for each loop to get the sheet from the sheetcollection  
                    foreach (Sheet thesheet in thesheetcollection.OfType<Sheet>())
                    {
                        //statement to get the worksheet object by using the sheet id  
                        Worksheet theWorksheet = ((WorksheetPart)workbookPart.GetPartById(thesheet.Id)).Worksheet;

                        SheetData thesheetdata = theWorksheet.GetFirstChild<SheetData>();

                      

                        for (int rCnt = 0; rCnt < thesheetdata.ChildElements.Count(); rCnt++)
                        {
                            List<string> rowList = new List<string>();
                            for (int rCnt1 = 0; rCnt1
                                < thesheetdata.ElementAt(rCnt).ChildElements.Count(); rCnt1++)
                            {

                                Cell thecurrentcell = (Cell)thesheetdata.ElementAt(rCnt).ChildElements.ElementAt(rCnt1);
                                //statement to take the integer value  
                                string currentcellvalue = string.Empty;
                                if (thecurrentcell.DataType != null)
                                {
                                    if (thecurrentcell.DataType == CellValues.SharedString)
                                    {
                                        int id;
                                        if (Int32.TryParse(thecurrentcell.InnerText, out id))
                                        {
                                            SharedStringItem item = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id);
                                            if (item.Text != null)
                                            {
                                                //first row will provide the column name.
                                                if (rCnt == 0)
                                                {
                                                    dtTable.Columns.Add(item.Text.Text);
                                                }
                                                else
                                                {
                                                    rowList.Add(item.Text.Text);
                                                }
                                            }
                                            else if (item.InnerText != null)
                                            {
                                                currentcellvalue = item.InnerText;
                                            }
                                            else if (item.InnerXml != null)
                                            {
                                                currentcellvalue = item.InnerXml;
                                            }
                                        }
                                    }
                                }
                                else
                                {
                                    if (rCnt != 0)//reserved for column values
                                    {
                                        rowList.Add(thecurrentcell.InnerText);
                                    }
                                }
                            }
                            if (rCnt != 0)//reserved for column values
                                dtTable.Rows.Add(rowList.ToArray());

                        }

                    }

                    return JsonConvert.SerializeObject(dtTable);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                throw;
            }
        }

Generated sample JSON file output is as below,

[
  {
    "ID": "1002",
    "Name": "ABCD",
    "City": "city",
    "Country": "USA"
  },
  {
    "ID": "1003",
    "Name": "PQRS",
    "City": "City",
    "Country": "UK"
  },
  {
    "ID": "1004",
    "Name": "ZYZZ",
    "City": "city3",
    "Country": "China"
  },
  {
    "ID": "1005",
    "Name": "LNMO",
    "City": "city4",
    "Country": "India"
  }
]

Read Excel with Empty Cell values

I have left a few cell entries empty intentionally in the Excel file as below.

Let’s read the content and generate the JSON output for the same.

convert Excel as JSON

Generated JSON file output as below

Empty fields will show as below.

[
  {
    "ID": "1002",
    "Name": "ABCD",
    "City": "",
    "Country": "USA"
  },
  {
    "ID": "1003",
    "Name": "PQRS",
    "City": "City",
    "Country": "UK"
  },
  {
    "ID": "1004",
    "Name": "ZYZZ",
    "City": "city3",
    "Country": "China"
  },
  {
    "ID": "1005",
    "Name": "LNMO",
    "City": "city4",
    "Country": "India"
  }
]

So the above logic works perfectly fine for empty cells as well.

References: Read/Write Excel using .NET Core

Are you dealing with any complex scenarios? Please let me know and sound off your comments below!



Please bookmark this page and share it with your friends. Please Subscribe to the blog to receive notifications on freshly published(2024) best practices and guidelines for software design and development.



12 thoughts on “C# Read Excel file As JSON

  1. Hey! Massively appreciate this article it’s been very helpful…

    Just wondering if you have a solution to if there wasn’t only the table in the excel file but that is what you are wanting to read from?

    Got a table that is underneath a couple of lines of text explaining the data in the table but when I use this code I’m hitting errors because “input array is longer than the number of columns in this table”.

    Let me know if you have any ideas for a solution!

    Thanks,

    Adam.

    1. Hi Adam- Thanks.Glad the article is helpful. Each Excel file Tab can be presented as a table. You can define DataSet overall and embed each table i.e tab.
      For Error “input array is longer than the number of columns in this table” – Please verify you are adding the correct number of columns and rows before creating the table structure. Each row needs to add correctly under the given column if exists. Hope this helps.

      1. Thanks for the quick reply!

        The issue I’m facing is my table begins on line A6 and cells A1-3 are taken up with text, the only information I’m wanting to extract is from the table but struggling to find a solution of being able to ignore the text that precedes it.

        This is where I’m getting the “input array is longer..” error but works as intended when it’s only a table in the sheet.

        Appreciate any help,

        Adam.

    1. Hi Chirag- Thanks for your query. I was able to read empty cells. May I know your sample excel structure?

  2. Not sure what’s changed. But trying this now gives me a whole other output than expected. To begin with, I got an error pointing to a self-referencing loop. I got rid of that by tweaking the code somewhat.
    return JsonConvert.SerializeObject(dtTable, Formatting.None,
    new JsonSerializerSettings
    {
    ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
    });
    Any clues?

  3. Hi-Thanks for this. I have requirement of reading excel as XML output. Could you please help how to get excel as XML output ?

Leave a Reply

Your email address will not be published. Required fields are marked *