LINQ: Distinct values

Question

I have the following item set from an XML:

id           category

5            1
5            3
5            4
5            3
5            3

I need a distinct list of these items:

5            1
5            3
5            4

How can I get distinct items by both Category and Id in LINQ?

feelingsofwhite · Accepted Answer · 2016-04-11 15:29:31Z

Are you trying to be distinct by more than one field? If so, just use an anonymous type and the Distinct operator and it should be okay:

var query = doc.Elements("whatever")
               .Select(element => new {
                             id = (int) element.Attribute("id"),
                             category = (int) element.Attribute("cat") })
               .Distinct();
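As a rough, self-contained sketch of the same idea applied to the data in the question: the XML shape below (an `items` root with `item` children carrying `id` and `cat` attributes) and the names `DistinctPairsDemo`, `SampleXml` and `DistinctPairs` are assumptions made for illustration, not part of the original answer. It works because anonymous types compare by the values of all their properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DistinctPairsDemo
{
    // Hypothetical XML holding the five rows from the question.
    public const string SampleXml =
        "<items>" +
        "<item id=\"5\" cat=\"1\"/>" +
        "<item id=\"5\" cat=\"3\"/>" +
        "<item id=\"5\" cat=\"4\"/>" +
        "<item id=\"5\" cat=\"3\"/>" +
        "<item id=\"5\" cat=\"3\"/>" +
        "</items>";

    // Returns each distinct (id, category) pair formatted as "id category".
    public static List<string> DistinctPairs(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Elements("item")
                   // Anonymous types get value-based Equals/GetHashCode,
                   // so Distinct() compares both properties.
                   .Select(e => new
                   {
                       id = (int) e.Attribute("id"),
                       category = (int) e.Attribute("cat")
                   })
                   .Distinct()
                   .Select(p => p.id + " " + p.category)
                   .ToList();
    }
}
```

Calling `DistinctPairsDemo.DistinctPairs(DistinctPairsDemo.SampleXml)` should leave three pairs for the sample data.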

If you're trying to get a distinct set of values of a "larger" type, but only looking at some subset of properties for the distinctness aspect, you probably want DistinctBy as implemented in MoreLINQ in DistinctBy.cs:

 public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
     this IEnumerable<TSource> source,
     Func<TSource, TKey> keySelector,
     IEqualityComparer<TKey> comparer)
 {
     HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
     foreach (TSource element in source)
     {
         if (knownKeys.Add(keySelector(element)))
         {
             yield return element;
         }
     }
 }

(If you pass in null as the comparer, it will use the default comparer for the key type.)

Oh so by "larger type" you may mean I still want all properties in the result even though I only want to compare a few properties to determine distinctness? — Nate Anderson, Commented Sep 20, 2016 at 14:31

Stu · Accepted Answer · 2013-03-12 19:36:57Z

Just use the Distinct() with your own comparer.

http://msdn.microsoft.com/en-us/library/bb338049.aspx
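A minimal sketch of what such a comparer might look like for the elements in the question. The class name `IdCategoryComparer` and the attribute names `id` and `cat` are assumptions for illustration; the casts throw if an attribute is missing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical comparer: two elements are equal when their
// "id" and "cat" attributes hold the same integer values.
public sealed class IdCategoryComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return (int) x.Attribute("id") == (int) y.Attribute("id")
            && (int) x.Attribute("cat") == (int) y.Attribute("cat");
    }

    public int GetHashCode(XElement obj)
    {
        if (obj is null) return 0;
        // Hash exactly the values that Equals compares, so equal
        // elements always produce equal hash codes.
        return ((int) obj.Attribute("id"), (int) obj.Attribute("cat")).GetHashCode();
    }
}
```

It could then be passed to the overload linked above, for example `doc.Elements("whatever").Distinct(new IdCategoryComparer())`, which keeps whole elements rather than projected values.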

edited Mar 12, 2013 at 19:36

answered Jun 15, 2009 at 20:03

James Alexander · Accepted Answer · 2009-06-15 20:10:56Z

In addition to Jon Skeet's answer, you can also use a group by expression to get the unique groups along with a count of the items in each group:

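// XElement has no Key property, so the group key is built from the
// "id" and "cat" attributes used elsewhere in this thread.
var query = from e in doc.Elements("whatever")
            group e by new { id = (int) e.Attribute("id"), category = (int) e.Attribute("cat") } into g
            select new { id = g.Key.id, category = g.Key.category, count = g.Count() };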

You wrote "in addition to Jon Skeet's answer"... I don't know if such a thing is possible. ;) – Yehuda Makarov, Commented Dec 12, 2018 at 21:50

Ricky Gummadi · Accepted Answer · 2015-07-21 03:33:36Z

For anyone still looking, here's another way of implementing a custom lambda comparer.

public class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _expression;

    public LambdaComparer(Func<T, T, bool> lambda)
    {
        _expression = lambda;
    }

    public bool Equals(T x, T y)
    {
        return _expression(x, y);
    }

    public int GetHashCode(T obj)
    {
        /*
         Returning the same hash code (0) for every object puts all items in
         the same hash bucket, so the hash check never rules out a match and
         the comparison always falls through to Equals, i.e. to the lambda.
         That is the behaviour we want here, at the cost of comparing each
         item against every distinct item found so far.
        */
        return 0;
    }
}

You can then create an extension method overload of LINQ's Distinct that takes a lambda:

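// Extension methods must live in a static class; the class name here is illustrative.
public static class EnumerableExtensions
{
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, T, bool> lambda)
    {
        return list.Distinct(new LambdaComparer<T>(lambda));
    }
}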

Usage:

var availableItems = list.Distinct((p, p1) => p.Id == p1.Id);

Looking at the reference source, Distinct uses a hash set to store elements it has already yielded. Always returning the same hash code means that every previously returned element is examined every time. A more robust hash code would speed things up because it would only compare against elements in the same hash bucket. Zero is a reasonable default, but it might be worth supporting a second lambda for the hash code. — Darryl, Commented Jul 1, 2017 at 2:50
Good point! I will try to edit when I get time. If you are working in this domain at the moment, feel free to edit. — Ricky Gummadi, Commented Jul 1, 2017 at 3:44
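One possible shape for the second-lambda idea raised in the comments, as a sketch only. `KeyedLambdaComparer` is a hypothetical name and is not code from the answer; the caller must supply a hash lambda that returns equal values for any two items the equality lambda treats as equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of LambdaComparer that also takes a hash lambda,
// so hash-based operators such as Distinct can bucket items instead of
// comparing every item against every distinct item seen so far.
public class KeyedLambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _hash;

    public KeyedLambdaComparer(Func<T, T, bool> equals, Func<T, int> hash)
    {
        _equals = equals ?? throw new ArgumentNullException(nameof(equals));
        _hash = hash ?? throw new ArgumentNullException(nameof(hash));
    }

    public bool Equals(T x, T y)
    {
        return _equals(x, y);
    }

    public int GetHashCode(T obj)
    {
        // Must agree with the equality lambda: items that compare
        // equal have to return the same hash code.
        return _hash(obj);
    }
}
```

Usage would mirror the original, with the hash derived from the same field that the equality lambda compares: `list.Distinct(new KeyedLambdaComparer<Item>((p, p1) => p.Id == p1.Id, p => p.Id.GetHashCode()))`, where `Item` stands in for whatever element type the list holds.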

Olle Johansson · Accepted Answer · 2014-04-30 08:58:22Z

I'm a bit late to the answer, but you may want to do this if you want the whole element, not only the values you want to group by:

var query = doc.Elements("whatever")
               .GroupBy(element => new {
                             id = (int) element.Attribute("id"),
                             category = (int) element.Attribute("cat") })
               .Select(e => e.First());

This will give you the first whole element matching your group-by selection, much like Jon Skeet's second example using DistinctBy, but without implementing an IEqualityComparer. DistinctBy will most likely be faster, but the solution above involves less code if performance is not an issue.

Omar Ali · Accepted Answer · 2012-03-12 21:16:51Z

// Assumes dt is an already-populated DataTable.
// DataRowComparer.Default compares rows by value: the number of columns and the data in each column.

IEnumerable<DataRow> distinctRows = dt.AsEnumerable().Distinct(DataRowComparer.Default);

foreach (DataRow row in distinctRows)
{
    Console.WriteLine("{0,-15} {1,-15}",
        row.Field<int>(0),
        row.Field<string>(1));
}

answered Mar 12, 2012 at 9:44 by Mohamed Elsayed
edited Mar 12, 2012 at 21:16 by Omar Ali

Aditya A V S · Accepted Answer · 2018-08-01 07:46:31Z

Since we are talking about having every element exactly once, a "set" makes more sense to me.

Example with classes and IEqualityComparer implemented:

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }

    public Product(int x, string y)
    {
        Id = x;
        Name = y;
    }
}

public class ProductCompare : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether either of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        // Check whether the products' properties are equal.
        return x.Id == y.Id && x.Name == y.Name;
    }

    public int GetHashCode(Product product)
    {
        // Check whether the object is null.
        if (Object.ReferenceEquals(product, null)) return 0;

        // Get the hash code for the Name property if it is not null.
        int hashProductName = product.Name == null ? 0 : product.Name.GetHashCode();

        // Get the hash code for the Id property.
        int hashProductId = product.Id.GetHashCode();

        // Combine the two hash codes for the product.
        return hashProductName ^ hashProductId;
    }
}

Now:

List<Product> originalList = new List<Product> {new Product(1, "ad"), new Product(1, "ad")};
var setList = new HashSet<Product>(originalList, new ProductCompare()).ToList();

setList will contain only the unique elements.

I thought of this while dealing with .Except(), which returns a set difference.

Guru Stron · Accepted Answer · 2023-02-02 15:14:41Z

Since .NET 6, the framework's own DistinctBy can be used. Something along these lines (leveraging value tuples):

var query = doc.Elements("whatever")
   .DistinctBy(e => ((int) e.Attribute("id"), (int) e.Attribute("cat")));
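For completeness, a small self-contained sketch of the same call wrapped in a method. It needs .NET 6 or later; the `item` element name and the `DistinctByDemo` and `DistinctElements` names are assumptions for illustration. Unlike the anonymous-type projection, this keeps the whole element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DistinctByDemo
{
    // Keeps one whole element per distinct (id, cat) pair.
    // Enumerable.DistinctBy is available from .NET 6 onwards.
    public static List<XElement> DistinctElements(XElement root)
    {
        return root.Elements("item")
                   // Value tuples compare by value, so the tuple works as a composite key.
                   .DistinctBy(e => ((int) e.Attribute("id"), (int) e.Attribute("cat")))
                   .ToList();
    }
}
```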