IMO, open datasets are more impactful than open models these days! You can now filter almost 200,000 of them on HF by modality, size and format: https://lnkd.in/dp_RqpY
🎉 Just shipped! Big update of the Hugging Face Datasets page. You can now filter datasets:
1. By Modalities (🖼️ Image, 🔊 Audio, 📝 Text, ...)
2. By Dataset Size (from 1k to ∞ samples)
3. By Format (JSON, CSV, Parquet, ...)
Should be easier to find the perfect dataset(s) for your next project(s)
Clem Delangue 🤗 This is great! One thing I would really love though, as an avid user of Hugging Face 🤗 (to the extent that it's one of my most DNS-requested domains), is the ability to keyword search, or even better vector search, for models and datasets based on their tags and READMEs. (I know full-text search exists, but it searches for files, not repositories, AFAIK.) Right now, if you search for "law", neither my EmuBert Australian legal model nor my Open Australian Legal Corpus shows up, because they don't contain the word "law" in their titles. Instead I have to use Google to find domain-specific models and datasets, since there's no guarantee they have the relevant keywords in their titles.
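To make the request above concrete, here is a minimal sketch of the difference between title-only search and search over title, tags, and README text. Everything in it is an assumption for illustration: the `Repo` class, both search functions, and the sample records are hypothetical, and none of it reflects how the Hub's search is actually implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical stand-in for a repository's searchable metadata."""
    title: str
    tags: list[str] = field(default_factory=list)
    readme: str = ""


def title_search(repos: list[Repo], query: str) -> list[str]:
    """Return titles whose title text contains the query (case-insensitive)."""
    q = query.lower()
    return [r.title for r in repos if q in r.title.lower()]


def metadata_search(repos: list[Repo], query: str) -> list[str]:
    """Return titles where the query appears in the title, tags, or README."""
    q = query.lower()
    hits = []
    for r in repos:
        # Naive substring match over all metadata joined into one string.
        haystack = " ".join([r.title, *r.tags, r.readme]).lower()
        if q in haystack:
            hits.append(r.title)
    return hits


# Made-up records for illustration only; these are not real Hub entries.
REPOS = [
    Repo("example/legal-bert", tags=["legal", "law"], readme="A model for legal text."),
    Repo("example/court-corpus", tags=["legal"], readme="Court decisions and case law."),
    Repo("example/cat-photos", tags=["image"], readme="Pictures of cats."),
]

title_hits = title_search(REPOS, "law")
metadata_hits = metadata_search(REPOS, "law")
```

With these sample records, a title-only search for "law" finds nothing, while the metadata search surfaces both legal repositories. Substring matching is crude (it would also match "lawn"); a real system would more likely tokenize the text or use embeddings, as the vector-search suggestion implies.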
Thanks for sharing!
Useful tips
This is going to foster a lot of innovation and experimentation.
I ❤️ Hugging Face
You are right. Thank you!
Love this! Towards truly open-source AI.
Very useful feature! Thank you for sharing. 🙌🏻
Could we also have a filter by last-refreshed date? I want to use the latest ones…