
I have data (from a CSV file) in the following format:

The data is a continuous list (row after row of comma-separated values) in which columns are inserted or removed at arbitrary points. Whenever that happens, a new 'header' line is added. The first column is always the date. A straightforward import looks like this:

[Image: Data source table format]

What I need for further data evaluation is a table like the following:

[Image: Wanted data table format]

As I have lots and lots of data in this format, manual rearrangement is not an option.

How can I rearrange the data automatically, either with a query import from the CSV file into Excel or by manipulating the data in Excel afterwards?

  • Based on the first image, does each group represent the data (sum total) for the next few days? In other words, does no Date-Item pair occur more than once in a file?
    – JohnSUN
    Commented Sep 23, 2022 at 9:47

1 Answer

From the point of view of computers and relational database theory, it is much better to transform your mixed data into this form:

[Image: Form of new table]

This work can be done quite quickly by the following script:

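Option Explicit

' Repacks the mixed table on the active sheet into a flat
' Date / Item / Value list on a newly added worksheet.
Sub repackMixedData()
    Dim sheet As Worksheet
    Dim rSource As Range
    Dim rRow As Range
    Dim rHeader As Range
    Dim oCellDate As Range
    Dim oCell As Range
    Dim oTargetCell As Range

    Set sheet = ActiveSheet
    Set rSource = sheet.UsedRange
    ' Create the output sheet and write the bold title row
    Set sheet = ThisWorkbook.Worksheets.Add()
    Set oTargetCell = sheet.Range("A1")
    With oTargetCell.Resize(1, 3)
        .Value = Array("Date", "Item", "Value")
        .Font.Bold = True
    End With
    For Each rRow In rSource.Rows
        Set oCellDate = rRow.Cells(1)
        If oCellDate = "Date" Then
            ' A header row: remember it for the data rows that follow
            Set rHeader = rRow
        Else
            ' A data row: write one output line per non-empty value
            For Each oCell In rRow.Offset(0, 1).Cells
                If Not IsEmpty(oCell) Then
                    Set oTargetCell = oTargetCell.Offset(1, 0)
                    oTargetCell.Value2 = oCellDate.Value2
                    oTargetCell.NumberFormat = oCellDate.NumberFormat
                    ' The item name sits in the same column of the current header
                    ' (index made relative in case the used range does not start in column A)
                    oTargetCell.Offset(0, 1) = rHeader.Cells(1, oCell.Column - rHeader.Column + 1).Text
                    oTargetCell.Offset(0, 2) = oCell.Value
                End If
            Next oCell
        End If
    Next rRow
End Sub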

And with the resulting "flat" table, you can do anything, for example, create a pivot table and get a result similar to your second screenshot.
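For readers who want to see that pivot step outside Excel, here is a minimal Python sketch that turns flat (Date, Item, Value) records into one row per date and one column per item. It uses only the standard library. The function name `pivot` and the sample records are made up for illustration, and it assumes each Date/Item pair occurs at most once (a later duplicate would simply overwrite an earlier one).

```python
def pivot(records):
    """Turn (date, item, value) records into a header plus one row per date."""
    items = {}   # used as an ordered set: item names in order of first appearance
    table = {}   # date -> {item: value}; dicts preserve insertion order
    for date, item, value in records:
        items[item] = None
        table.setdefault(date, {})[item] = value
    header = ["Date", *items]
    # Items that have no value on a given date become empty strings.
    rows = [[date] + [cells.get(item, "") for item in items]
            for date, cells in table.items()]
    return header, rows


# Hypothetical sample data in the flat form produced by the macro.
records = [
    ("2022-01-01", "A", 1),
    ("2022-01-01", "B", 2),
    ("2022-01-02", "B", 3),
    ("2022-01-02", "C", 4),
]
header, rows = pivot(records)
print(header)
for row in rows:
    print(row)
```

The flat form stays the primary data; the wide table is just one view derived from it, which is also how an Excel pivot table treats it.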

Update: Since it turned out that there is much more real data than expected, I have slightly improved the macro. It now collects the results in an array and writes them to the sheet in one step. Please try this version; it should be a bit faster.

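' Faster variant: collects all results in an array and writes them
' to the new worksheet in a single operation.
Sub repackMixedData2()
    Dim sheet As Worksheet
    Dim rSource As Range
    Dim rRow As Range
    Dim rHeader As Range
    Dim oCellDate As Range
    Dim oCell As Range
    Dim countOfValues As Long
    Dim arrResult As Variant
    Dim index As Long

    Set sheet = ActiveSheet
    Set rSource = sheet.UsedRange
    ' Upper bound for the number of output rows: every non-empty cell
    ' right of the date column, plus one row for the titles
    countOfValues = Application.WorksheetFunction.CountA(rSource.Offset(0, 1))
    ReDim arrResult(1 To countOfValues + 1, 1 To 3) As Variant
    index = 1
    arrResult(index, 1) = "Date"
    arrResult(index, 2) = "Item"
    arrResult(index, 3) = "Value"
    For Each rRow In rSource.Rows
        Set oCellDate = rRow.Cells(1)
        If oCellDate = "Date" Then
            ' A header row: remember it for the data rows that follow
            Set rHeader = rRow
        Else
            For Each oCell In rRow.Offset(0, 1).Cells
                ' Stop at the first empty cell: the values in a row
                ' are assumed to be contiguous
                If IsEmpty(oCell) Then Exit For
                index = index + 1
                arrResult(index, 1) = oCellDate.Value2
                ' Column index made relative in case the used range does not start in column A
                arrResult(index, 2) = rHeader.Cells(1, oCell.Column - rHeader.Column + 1).Text
                arrResult(index, 3) = oCell.Value
            Next oCell
        End If
    Next rRow

    ' Write only the rows that were actually filled
    Set sheet = ThisWorkbook.Worksheets.Add()
    sheet.Range("A1:C" & index).Value2 = arrResult
    sheet.Range("A1:C1").Font.Bold = True
    ' The date format is taken from the first cell of the last source row
    sheet.Range("A:A").NumberFormat = oCellDate.NumberFormat
End Sub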
  • Thank you! While I'm not sure this is optimal, it certainly did the trick. The macro runs for 3 min on my spreadsheet and produces 205k single entries.
    – BmyGuest
    Commented Sep 23, 2022 at 12:39
  • @BmyGuest Oh, I didn't optimize the code. From the description of the task I assumed we were talking about data for one year (the screenshot started on January 1st), so there would be a little more than 360 lines to process, and for such a trifle it was not worth thinking about optimization. Of course, the macro can be made much faster: the longer the programmer thinks, the faster the program runs.
    – JohnSUN
    Commented Sep 23, 2022 at 12:48
  • Yes, my example was deceptively small :c) There are not that many dates (4 years), but actually quite a lot of on/off items. My goal is to create, as easily as possible, an "item(date)" line graph for each of the items, but I'm not an Excel expert at all. I was hoping that some data-query wizardry could get my data into shape. Otherwise I'm tempted to script a little .txt->.txt command line tool instead. But your code snippet was interesting to see. I've learnt something new, so thank you!
    – BmyGuest
    Commented Sep 23, 2022 at 15:10
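
For anyone who, like the asker, would rather script the conversion outside Excel, the repacking idea can be sketched as a small Python command-line filter. This is a minimal sketch under stated assumptions, not a tested replacement for the macro: it assumes that every header line starts with the literal text `Date`, that the file is comma-separated, and that an empty cell means "no value". The function name `repack` and the script name `repack.py` are made up for illustration.

```python
import csv
import sys


def repack(lines):
    """Yield (date, item, value) tuples from CSV lines with repeated headers.

    Assumption: a row whose first cell is exactly "Date" is a header that
    applies to all following rows until the next header appears.
    """
    header = None
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if row[0].strip() == "Date":
            header = [cell.strip() for cell in row]
            continue
        if header is None:
            raise ValueError("data row found before any header row")
        date = row[0].strip()
        # Pair each value with the item name in the same column of the
        # current header; zip drops cells beyond the header's width.
        for item, value in zip(header[1:], row[1:]):
            value = value.strip()
            if value != "":
                yield date, item, value


def main(argv):
    # Hypothetical usage: python repack.py input.csv output.csv
    with open(argv[1], newline="") as src, open(argv[2], "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["Date", "Item", "Value"])
        writer.writerows(repack(src))


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv)
    else:
        print("usage: python repack.py input.csv output.csv", file=sys.stderr)
```

The output file has the same flat Date / Item / Value layout that the macro produces, so it can be imported into Excel and charted or pivoted per item. Unlike `repackMixedData2`, this sketch skips empty cells instead of stopping at the first one.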
