I have some HTML files that contain tables, which I need to perform some analysis on.

I can open them in Excel, and it preserves all the table formatting and layout (which is what I want).

The problem is that, by default, it formats all cells as "general". This means Excel's "smart" data conversion kicks in, which, as has been noted by many on Stack Exchange in the past, causes all kinds of problems when codes and names are mistaken for dates and converted to numbers.

There are ways to get around this when importing from plaintext, forcing Excel to bring up a wizard which allows you to change the import format from "general" to "text". How do I make Excel treat everything as text for an HTML file?

Is there perhaps some way I can change a global Excel setting that stops the general format from converting dates? Or is there some way specifically for opening html files that will stop the "general" format from being applied?

2 Answers

1

I would use the Power Query Add-In for this. Power Query can read HTML files (local or web). It looks for tables so there will need to be some consistency in the HTML structure. Once the HTML table has been read, it will try to automatically detect dates - you can override this and convert columns manually.

2
  • I will try this, hope IT don't make a fuss
    – Some_Guy
    Commented Sep 21, 2015 at 14:32
  • Pointing them at this info might help - Power Query will be embedded in Excel 2016+ : blogs.office.com/2015/09/10/…
    – Mike Honey
    Commented Sep 22, 2015 at 1:27
1

1 year later, you can use a web query and change the options to disable date recognition, as specified here: https://support.microsoft.com/en-gb/kb/287027

Rather than opening the HTML file, point the web query to a local address (file:///C:/Users/.../file.html)
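
If you are unsure how to write the local address, here is a minimal Python sketch that turns a Windows path into the file:/// form that the Address box accepts. The path below is a made-up example, not the elided one above, and `local_file_url` is just a helper name chosen for illustration.

```python
from pathlib import PureWindowsPath


def local_file_url(windows_path):
    """Convert an absolute Windows path to a file:/// URL.

    PureWindowsPath works on any OS because it never touches the file system.
    """
    return PureWindowsPath(windows_path).as_uri()


# Hypothetical path, for illustration only.
print(local_file_url(r"C:\Users\me\Reports\table.html"))
# prints: file:///C:/Users/me/Reports/table.html
```

Characters such as spaces are percent-encoded by `as_uri()`, so the result can be pasted into the web query dialog as is.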

To prevent Excel from automatically converting numbers to dates, follow these steps when you create a new Web query:

In Microsoft Office Excel 2003 or in Microsoft Excel 2002, point to Import External Data on the Data menu, and then click New Web Query.

In Microsoft Office Excel 2007, click From Web in the Get External Data group on the Data tab.

In the Address box, type the address of the Web page that contains the table that you want to import, and then click Go.

Click the appropriate table marker to select the table that you want to import.

Click Options.

Under Other Import settings, click to select the Disable date recognition check box.
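
If the analysis can happen outside Excel, a different route (not the web query method above) is to read the tables with a script so that no type guessing takes place at all. The following is a rough Python sketch using only the standard library. It assumes well-formed markup with explicit closing `</td>`, `</th>` and `</tr>` tags and no nested tables, and the names `TableTextParser` and `read_tables_as_text` are my own.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect each <table> as a list of rows; every cell stays a string."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # current row, or None when outside <tr>
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def read_tables_as_text(html):
    """Return all tables in the HTML string as nested lists of strings."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.tables


sample = "<table><tr><th>Code</th><th>Name</th></tr><tr><td>1-2</td><td>MAR1</td></tr></table>"
print(read_tables_as_text(sample))
# prints: [[['Code', 'Name'], ['1-2', 'MAR1']]]
```

Values like 1-2 or MAR1 come through exactly as written, because nothing here ever tries to interpret them as dates. Note that this drops the table formatting and layout, so it only helps if you need the data rather than the presentation.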
