0

I have a spreadsheet with a massive amount of data (circa 8000 lines). The issue is that, of those 8000, there is a group of 9 lines whose data, spread over various columns across those rows, needs to be merged into a single row.
For example, when I filter for a single item, I get 9 rows of it. Several columns hold different values, staggered across those rows, and I'd like to tidy it up by having only one row with all the data showing in its appropriate column. So, for example, it looks like this:

start point

And I'm trying to get it to look like this:

end result

Wondering if anyone had some advice on what I can do to speed this process up, instead of manually doing it?

2
  • 3
    Ollie, your pictures only show me a rough layout; the content is not readable. You might consider linking bigger pictures in which one could read the content of the cells. This is necessary because your description does not show how you want to reach a compressed representation.
    – r2d3
    Commented Mar 10, 2022 at 22:07
  • Ollie, btw was the account yours?: superuser.com/users/1647607/ollie, and did you previously ask this question too?: superuser.com/questions/1691741/… Commented Mar 12, 2022 at 3:01

2 Answers

1

Eh... it doesn't matter about small pictures, etc. They are clear enough for a general solution. You have a single item in each column and want them all in a single row.

Added late: once you insert a row, as mentioned below, you can simply place the following formula in the column B cell of the inserted row:

=INDEX(SORT(B2:B10,,-1),1)

Then copy it to the Clipboard and paste it into all the cells of the row.

You could even make it:

=IF(ISNUMBER(INDEX(SORT(B2:B10,,-1),1)),VALUE(INDEX(SORT(B2:B10,,-1),1)),INDEX(SORT(B2:B10,,-1),1))

to address the "number as text, but mostly usable" problem mentioned below.

(Sorry, thought of it after all the below.)

Isolated approaches the problem in a very nice way if the data is all numeric as it may well be. But if not, the below will do a general solution so long as you truly do have a single item per column in the nine row sets.

=SUBSTITUTE(IFERROR(TRANSPOSE(FILTERXML("<group><element>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(ARRAYTOTEXT(TRANSPOSE(B2:P10),1),",",""),"{",""),"}",""),";","</element><element>")&"</element></group>","/*/*")),""),CHAR(34),"")

Note: its results are all TEXT but are directly usable in calculations. If a value is 9.2, while it shows as '9.2 (so... left justified like letters, does not "take" any number formatting you do to it), if another cell adds 5 to the cell it is in, you get 14.2. You could add further dense-looking steps around the above to fix that, but I stopped at the point of moving everything into a single row.
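
As one possible version of those "further steps", here is a minimal sketch that turns the numeric-looking text back into real numbers. It assumes the formula above was entered in B11 and spills across that row (B11 is a hypothetical location, not something from the question), and that this second formula goes in a spare row:

```
=IFERROR(VALUE(B11#), B11#)
```

VALUE() errors on anything it cannot read as a number, so IFERROR() hands back the original text for those cells. One caveat: VALUE() also converts text that merely looks like a date, time or percentage, so check the result if your data contains such strings.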

First, since the first row of your nine rows (or however many if nine was just an example) might have, or maybe certainly does have, a value of its own to preserve, you need to insert a blank row, presumably just above or below these rows. Once it is populated, you can copy it and paste special the values and finally delete the original nine rows.

Usually in cases like this, one uses TEXTJOIN() to make a single string from the material at hand because one wants the empties gone and it will do that for you directly, no extra steps like done here. But if you lose the empties here, you also lose the ability to put the results into the columns they began in.

So ARRAYTOTEXT() is used. It can make a string with a comma-space separator, but then you lose all the column information. Using its "strict" form puts semi-colons at the point columns end. That's needed.

It also adds a { and a }, which SUBSTITUTE() removes (two of the uses of that function here), and it puts double quotes around any text elements (yes, it distinguishes that 9.2 from "9.2" right up till the end, then presents it as text anyway... go figure...). Those are removed at the very end, but could likely be removed further "inside" if you wanted all that kind of replacement to be done in one group of functions.
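
To see the difference on a small scale, the following sketch can be tried on any small range that contains an empty cell. B2:C3 is just an assumed test range, and each formula goes in its own cell:

```
=TEXTJOIN(";", TRUE, B2:C3)
=ARRAYTOTEXT(B2:C3, 0)
=ARRAYTOTEXT(B2:C3, 1)
```

The first skips empty cells entirely, so positions are lost. The second (concise form) returns a plain comma-separated list. The third (strict form) returns the braces, the quoted text, and a semicolon between rows of the array it is given, which is why the main formula transposes the range first so that those semicolons land where the original columns end.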

At first you get a string that looks like so:

"""a"";""g"";""e"";9.2;""r"";""k"";;;;;""w"";""e""""r"";""b"";""h"";""k"""

(Later on, all the extra " characters go away, leaving just the single pair around each text element for you to clear yourself. I did not experiment with it, just said to myself "Let's remove " characters at the end so we don't mess anything up earlier on.")

Notice the ;'s in there maintaining the columns for us? Including the set in the middle that will give those four empty cells in the middle in the end result? (Tried to match your pattern, if not the unreadable data.) Using the strict form allowed that.

TRANSPOSE() is used twice. The one buried inside is because ARRAYTOTEXT() reads across columns, then down to the next row in a range and we need down columns, then up to the first row of the next column. The one closer to the start of the formula is due to FILTERXML(). It wants to present its results in a single column and you need it in a single row. Fortunately, both times the transposition you needed was the simple one TRANSPOSE() does. But if you needed, say, the exact opposite, to start in the lower right corner and read up, then over to the next column and so on, you could use INDEX() to rearrange things that way. It can do all four rearrangements that make geometric sense.

At the end, since those empty columns would result in ERRORS! in those cells, we use IFERROR() to place "" in those cells instead. By the way, while that solves this problem, it does make them not-BLANK if that makes a difference in any place the data is used.

Then that final step of removing the "s from around text results.

All that is the window-dressing that makes the result usable. The actual engine in the formula, the one that makes it all work is using FILTERXML() to take the massaged output from ARRAYTOTEXT() and change it back from a single string of characters into separate pieces of data, including the empty columns' contributions, so that it can fill a row instead of a single cell.

Basically, we use a string before the ARRAYTOTEXT() results and one after them, as well as using SUBSTITUTE() on those ARRAYTOTEXT() results, to make an XML string that FILTERXML() can make sense of.

(Remember those ; characters we valued so much above? The work can actually be done using a slightly more complex formation of that XML and we could do away with that SUBSTITUTE() layer. Slightly more complex than that and we could do away with getting rid of the { and } characters. But that isn't handy for people seeing the technique for the first time, so I did not do it here.)

So for a recap, we made a careful string of all the data in the range, then massaged it to remove unneeded things, added to it and massaged it further to make an XML string, then converted that string into a set of cell values for Excel, and did some final massaging.

There is only one changeable thing in the formula and that is the range to work on. In the above, that was B2:P10. If you need to change it because, say, one set of data has 4 rows or 14 rows, just change that. You could even add LET() around the whole thing, and give the range a name. The purpose of that would be that the range is now right at the start, first line of the formula, and so it is very easy to see and change. Much easier than buried on row whatever of the formula and hard to see amongst the dense muck about it. In this case, not a lot of help, just some, but if you had several things that could need editing, doing that would really shine.
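
A sketch of what that LET() version might look like follows. The names rng, joined, cleaned and xmlText are my own illustrative choices; the steps are the same ones as in the formula above, just split out and labelled:

```
=LET(
    rng, B2:P10,
    joined, ARRAYTOTEXT(TRANSPOSE(rng), 1),
    cleaned, SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(joined, ",", ""), "{", ""), "}", ""),
    xmlText, "<group><element>" & SUBSTITUTE(cleaned, ";", "</element><element>") & "</element></group>",
    SUBSTITUTE(IFERROR(TRANSPOSE(FILTERXML(xmlText, "/*/*")), ""), CHAR(34), "")
)
```

To work on a different set of rows, only the B2:P10 on the second line needs to change.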

You'll need to Sort before starting, so that the rows to be tidied up are grouped together. Remember too that you insert a row, paste the formula in the first cell above a data column (the column B cell for that row), then copy to the Clipboard and paste special (values) to have a clean (of formulas) row with results and then, finally, you delete all the original data by deleting their rows.

If you want a simple solution to the numeric entries displaying as text (though being basically usable as numbers to Excel... mostly...), one that changes them back to numbers for sure, useful everywhere Excel finds numbers useful, you can do your work and then use the Paste|Special|Multiply technique, multiplying by 1. Enter "1" in some cell, format the cell for how you want numbers displayed, copy the cell to the Clipboard, then select the results and Paste|Special|Multiply. All of them will change back to completely proper numbers while text will not be affected (as in no errors!).

0

Like r2d3 mentioned in the comments, it's impossible to read your sample pictures. The solution here assumes the following:

  1. Your data are numbers.
  2. You only care about one of the numbers for each item/column combination, such as the max number.

With that in mind, you can combine your rows using a MAXIFS formula.

First Step: Copy your items column and paste elsewhere, then remove duplicates. Copy and paste your other column headers, too.

Second Step: Use a formula such as this one to obtain the max value for each item:

    =MAXIFS(B:B,$A:$A,$G2)

Copy across and down, then you're done.
