0

I am using Power Query in Excel for Microsoft 365.

Here is my source data table (whose table name is Source):

Col1 Col2 Col3 Col4 Name Alt.Name
11 21 31 41 Dog Fido
12 22 32 42 Cat Fluffy
23 33 43 Ostrich Jan
14 24 44 Parakeet Bob
15 25 35 45 Trout Cindy

I have defined two text parameters:

  • ColToKeep specifies which one of the columns Col1, Col2, Col3, and Col4 to keep. The columns not specified will be discarded.
  • NameToKeep specifies which of the columns Name and Alt.Name to keep. The column not specified will be discarded.

My goal is to develop a Power Query to:

  1. Remove all columns except the two specified by ColToKeep and NameToKeep
  2. Filter out rows where there is a null in the column specified by ColToKeep

Assuming I set ColToKeep to Col3 and assuming I set NameToKeep to Alt.Name, the desired result is:

Col3 Alt.Name
31 Fido
32 Fluffy
33 Jan
35 Cindy

However, the actual (incorrect) result is:

Col3 Alt.Name
31 Fido
32 Fluffy
33 Jan
Bob
35 Cindy

Here is the M Language code for my Power Query:

let
    Source = Excel.CurrentWorkbook(){[Name="Source"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Col1", Int64.Type}, {"Col2", Int64.Type}, {"Col3", Int64.Type}, {"Col4", Int64.Type}, {"Name", Text.Type}, {"Alt.Name", Text.Type}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{ColToKeep, NameToKeep}),
    // #"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each [ColToKeep] <> null and [ColToKeep] <> "")
    // #"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ColToKeep <> null and ColToKeep <> "")
    // #"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each Table.ColumnNames(#"Removed Other Columns"){0} <> null and Table.ColumnNames(#"Removed Other Columns"){0} <> "")
in
    #"Filtered Rows"

It is the #"Filtered Rows" applied step that is going wrong. In the code above, I show three (commented-out) attempts I've made at implementing this applied step.

In the first attempt, I simply refer to the column names directly (with brackets). However, ColToKeep and NameToKeep are taken to be references to the columns of interest, not as text strings that hold the column names. An error is thrown because, e.g., there is no column named ColToKeep.

In the second attempt, I again refer to the column names directly, but this time without brackets. This does not throw an error, but I get the incorrect output shown above.

In the third attempt, I refer to the columns by their position since I don't know until runtime which columns are being kept. After the preceding #"Removed Other Columns" applied step is executed, there will be two columns remaining. I assumed their position indexes would be 0 and 1, but I'm suspecting the position indexes have not yet been adjusted when the Table.SelectRows call executes and that I am therefore not referencing the column I intend to when filtering out null values. This does not throw an error, but it does generate the incorrect output shown above.

How may I filter out rows based on null values in the column named by the parameter ColToKeep?

1 Answer 1

1

Temporarily drop the headers to the first row which will number the columns as 1 and 2, then filter out the nulls, then re-promote the first row as headers:

let
    Source = Excel.CurrentWorkbook(){[Name="Source"]}[Content],
    #"Removed Other Columns" = Table.SelectColumns(Source,{ColToKeep, NameToKeep}),
    #"Demoted Headers" = Table.DemoteHeaders(#"Removed Other Columns"),
    #"Changed Type" = Table.TransformColumnTypes(#"Demoted Headers",{{"Column2", type text}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Column1] <> null)),
    #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [Column2] <> null),
    #"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows1", [PromoteAllScalars=true])
in
    #"Promoted Headers"

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .