Headed to SciPy? Join Guen Prawiroatmodjo, Elena Felder, Alex Monahan, Mehdi Ouazza, and Nicholas Ursa from the MotherDuck team to learn more about DuckDB's highly efficient, in-memory engine. Tuesday, 7/9: All the SQL a Pythonista needs to know: Introduction to SQL & DataFrames with DuckDB https://lnkd.in/eaT9eKK4 Friday, 7/12: How to Bootstrap a Data Warehouse with DuckDB
MotherDuck’s Post
Do you crunch data in Python? Learning SQL can help you solve larger data problems faster! Join us at SciPy for a crash course tutorial on SQL (tailored for a Pythonista), plus a talk on building your own data warehouse for your team!
How to bootstrap a Data Warehouse with DuckDB SciPy 2024
cfp.scipy.org
Unlocking the world of SQL commands with Techlive Solutions! 💻🔍 From SELECT to DELETE, we're diving deep into the different types of SQL commands that power our databases. 📊💡 Join us on this data-driven journey! 🚀 #LearnSQL #learnsql #coding #learncoding #learncode #upgradeskills #newskills #SQL #sql #sqldatabase #sqldeveloper
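Those command categories are easy to try from Python itself. A self-contained sketch using the standard library's `sqlite3` module; the table and data are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: CREATE defines the schema.
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# DML: INSERT, UPDATE, and DELETE change the data.
cur.execute("INSERT INTO users (name) VALUES (?)", ("Ada",))
cur.execute("INSERT INTO users (name) VALUES (?)", ("Grace",))
cur.execute("UPDATE users SET name = ? WHERE name = ?", ("Ada L.", "Ada"))
cur.execute("DELETE FROM users WHERE name = ?", ("Grace",))

# DQL: SELECT reads what's left.
rows = cur.execute("SELECT id, name FROM users").fetchall()
conn.close()
```

After the update and delete, only the renamed first row survives in `rows`.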
Hello, hello 👋 Data cleaning – the great villain of the data science realm, a task that demands attention no matter the weather or our daydreams of lounging on a beach with a cocktail in hand. 🏖️🍹 In my recent exploration, I created a solution that transformed my data-cleaning nightmares into a conquerable challenge. Allow me to introduce the NullValueInspector, a custom SQL Server stored procedure (built in SSMS) that saved my sanity and precious time in the face of expansive tables and countless columns. Until our next SQL or Python adventure, may your tables be tidy, your queries swift, and your data ever dependable. Happy querying! 🧑‍💻🦾 #thankyouforyoursupport #datacleaning #sqlserver #datasciencejourney
Mastering SQL Data Quality: Exploring NULL Values with a Custom SSMS Stored Procedure
medium.com
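The linked article implements the inspector as a T-SQL stored procedure. The same idea translates to pandas; the sketch below is hypothetical (function name and data invented, not the author's code) and reports NULL counts per column:

```python
import pandas as pd

def null_value_inspector(df: pd.DataFrame) -> pd.DataFrame:
    """Report, per column, how many values are missing and what share that is."""
    report = pd.DataFrame({
        "null_count": df.isna().sum(),
        "null_pct": (df.isna().mean() * 100).round(2),
    })
    # Surface the worst offenders first, which is what you want
    # when facing a wide table with countless columns.
    return report.sort_values("null_count", ascending=False)

df = pd.DataFrame({"a": [1, None, 3], "b": [None, None, "x"], "c": [1, 2, 3]})
report = null_value_inspector(df)
```

For the sample frame, column `b` tops the report with two missing values.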
The Role of SQL in LLM Apps and How Snowflake Uses LangChain In an interview with Adrien Treuille, we discuss building data apps with LLMs and SQL, using LangChain, and why Snowflake loves Code Llama #machinelearning
The Role of SQL in LLM Apps and How Snowflake Uses LangChain
https://thenewstack.io
Snowflake Arch | Azure ADF | Azure DevOps CI/CD | ML | Big Data | PySpark | Databricks | DBT | AWS | SQL
In PySpark, there are two methods for converting an RDD into a DataFrame: RDD.toDF() and DataFrame.toDF(). But there's a key difference between them. Let's explore! 🔸RDD.toDF(): ▪️ Used to convert an RDD into a DataFrame. ▪️Specify the initial schema by providing column names. ▪️Handy when you have data in an RDD and want to create a DataFrame. 🔸DataFrame.toDF(): ▪️Used to change column names or add new columns to an existing DataFrame. ▪️Modify the schema of your DataFrame while preserving its data. ▪️Ideal when you're working with an existing DataFrame and need schema updates. The choice between these methods depends on your use case.
This is why the DataExpert.io platform will be better than LeetCode: - LeetCode leverages an in-browser SQLite database to run your queries against. This makes them very fast but also extremely small scale. - DataExpert.io leverages Apache Iceberg and Trino to run its queries. You execute your queries against a REAL data lake, not a database so small it fits into a browser. - LeetCode only tests single-query efficiency and logic. - DataExpert.io allows you to create tables and will test not just your query efficiency but also your modeling efficiency and DAG efficiency. I'm really excited to get this SQL and data modeling platform off the ground. It will be ready in a month or so, I'm pretty sure! #dataengineering #datascience
Ever tried analyzing over a billion rows of data locally with pandas? It probably didn't go well. In this follow-up blog to a blog replicating another blog, Phillip Cloud takes pandas, Ibis + DuckDB, and Dask for a spin on a large amount of PyPI data. "TL; DR: Ibis has a lot of great backends. They’re all good at different things. For working with local data, it’s hard to beat DuckDB on feature set and performance." Look out for more posts in this series testing out other Python dataframe libraries and Ibis with a variety of backends! Dask, pandas, Polars, DataFusion, and over a dozen other query engines can be used with a consistent dataframe API via Ibis!
Ibis versus X: Performance across the ecosystem part 1
ibis-project.org
Certified ScrumMaster® (CSM®) | Senior Business Analyst - Banking | Project Management | Predictive Analytics | Healthcare | Machine Learning | Finance Enthusiast | Blogger | Digital Marketing
🚀 Elevate Your Data Exploration Game: Unveiling the Perfect Pair - PostgreSQL and Jupyter Notebook! 📊💡 Ready to supercharge your data analysis? 📈 Dive into the world of seamless PostgreSQL integration with Jupyter Notebook – a dynamic duo that empowers you to uncover insights like never before. From the simplicity of Psycopg2 to the magic of SQLAlchemy and Pandas, discover the top methods to seamlessly connect your PostgreSQL database with Jupyter Notebook. 🐘📓 Whether you're a SQL aficionado or a Python wizard, these techniques will transform your data exploration journey. But wait, there's more! Learn how to run SQL queries directly with magic commands and even explore the Docker route for a containerized setup. The possibilities are endless and exciting! 🔗 Get ready to embark on a data-driven adventure. Boost your skills, ignite your curiosity, and let the exploration begin! 🌟 #ai #ml #artificialintelligence #machinelearning #dl #deeplearning #neuralnetworks #dataanalytics #datascience #data
Seamless PostgreSQL Integration with Jupyter Notebook: Uncover the Ultimate Data Exploration Duo!
link.medium.com
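One of the methods the article describes, SQLAlchemy plus `pandas.read_sql`, sketched below. The PostgreSQL URL in the comment is a placeholder; an in-memory SQLite engine stands in so the snippet runs without a live server, and the table is invented:

```python
import pandas as pd
from sqlalchemy import create_engine, text

# For PostgreSQL the URL would look like
# "postgresql+psycopg2://user:password@localhost:5432/mydb" (placeholder values);
# an in-memory SQLite URL stands in here so the snippet needs no server.
engine = create_engine("sqlite:///:memory:")

# Seed a tiny table so there is something to query.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE metrics (day TEXT, value INTEGER)"))
    conn.execute(text("INSERT INTO metrics VALUES ('mon', 1), ('tue', 2)"))

# pandas turns any SQL query into a DataFrame, ready for notebook analysis.
df = pd.read_sql("SELECT day, value FROM metrics", engine)
```

In a notebook, swapping the URL is the only change needed to point the same two lines of query code at a real PostgreSQL database.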
Hello everyone, I learned some really basic functions, and these two essential functions are incredibly helpful for data manipulation. If you're interested, you can check out my article on how to use the two essential functions, "from_json" and "explode", to work with a JSON column within CSV files. All of my code is stored in my GitHub repository; see the second link. https://lnkd.in/dax_mmgp https://lnkd.in/d3UMCuUS
What and How to use from_json and explode functions on a JSON column in CSV files with PySpark
medium.com
Did you know you can easily create #DeltaLake tables with pandas without needing to depend on Spark? 🙌 You can also append to Delta tables, overwrite Delta tables, and overwrite specific Delta table partitions using #pandas. Delta transactions are implemented differently than pandas operations with other file types like #CSV or #Parquet. Normal pandas transactions irrevocably mutate the data whereas Delta transactions are easy to undo. In this blog post, you'll learn how to create and append to #DeltaLake tables with pandas. 🌊 Dive in: https://lnkd.in/g_KVUW4v #opensource #oss #linuxfoundation
How to create and append to Delta Lake tables with pandas
delta.io