Explainable AI Featured Posts Natural Language Processing Podcasts

Podcast: Creating Custom LLMs

Despite GPT, Claude, Gemini, Llama, and the host of other LLMs that we have access to, a variety of organizations are still exploring their options when it comes to custom LLMs. Logging in to ChatGPT is easy enough, and so is creating a ‘custom’ OpenAI GPT, but what does it take to create a truly […]

Data Sets Deep Learning Featured Posts Synthetic Data Time Series

Synthesizing Multi-Table Databases: Model Evaluation & Vendor Comparison

Synthesizing multi-table tabular data presents challenges of its own compared to the single-table case. When the database contains date columns such as transaction or admission dates, a frequent occurrence in real-world datasets, generating high-quality synthetizations and evaluating models become even more complicated. In this article, we focus on this type of problem, comparing generated observations produced by […]

Explainable AI Featured Posts Machine Learning Natural Language Processing

New Trends in LLM: Overview with Focus on xLLM

If you ever wondered how xLLM differs from other LLM and RAG architectures, which foundational changes make it appealing to Fortune 100 companies, and which innovations are being copied by competitors, read on. In this article, I share the latest trends and provide a high-level summary of xLLM, describing the […]

Books Courses Deep Learning Explainable AI Featured Posts Generative AI Machine Learning Natural Language Processing Python Synthetic Data Time Series Visualization

New Book: State of the Art in GenAI & LLMs — Creative Projects, with Solutions

With 23 top projects, 96 subprojects, and 6,000 lines of Python code, this vendor-neutral coursebook is a goldmine for any analytics professional or AI/ML engineer interested in developing superior GenAI or LLM enterprise apps using ground-breaking technology. This is not another book discussing the same topics that you learn in bootcamps, college classes, Coursera, or […]

Explainable AI Featured Posts Generative AI Synthetic Data

GenAI Evaluation Metrics: Your Best Loss Functions to Boost Quality

Whether dealing with LLM, computer vision, clustering, predictive analytics, synthetization, or any other AI problem, the goal is to deliver high-quality results in as little time as possible. Typically, you assess the output quality after producing the results, using model evaluation metrics. These metrics are also used to compare various models, or to measure […]

Data Sets Explainable AI Featured Posts Generative AI Natural Language Processing Python

Breakthrough: Zero-Weight LLM for Accurate Predictions and High-Performance Clustering

While most AI companies keep building LLMs with more weights and tokens (one trillion is now a standard number), I went in the opposite direction. Of course, zero weight means that there is no neural network behind the scenes. More specifically, it means that there is no lengthy black-box process to find the “best” weights […]

Explainable AI Featured Posts Generative AI Natural Language Processing Python

Build and Evaluate High Performance Taxonomy-Based LLMs From Scratch

One obvious way to dramatically improve the quality of LLM and RAG systems is to use high-quality input sources, as opposed to just raw text from crawled or parsed content. Combine it with specialization: one LLM per top domain, allowing the user to customize parameters and specify the domain in addition to standard concise […]

Explainable AI Featured Posts Generative AI Natural Language Processing Python

Hallucination-Free, Self-Tuned, Fast Hierarchical LLMs with Multi-Token Embeddings

The new generation of RAG / LLM architecture is moving away from the original monolithic and generic OpenAI model, towards a collection of decentralized and specialized LLMs jointly organized and governed via multi-agent systems. The benefits are obvious: low latency, smaller tables (one per LLM), faster training and fine-tuning, energy efficiency, and better results, with much lower […]

Featured Posts Generative AI Natural Language Processing Python

Extreme LLM: Case Study, Documentation, Best Practices, and Python sources

Extreme LLM, abbreviated as xLLM, relies on multiple specialized large language models, one per top category, to deliver highly relevant answers to specific questions, covering all of human knowledge or targeted content such as corporate repositories. The user, in addition to the classic prompt, is invited to select or guess top categories. Behind the scenes, […]

Explainable AI Featured Posts Generative AI Machine Learning Natural Language Processing Synthetic Data Time Series

Probabilistic Nearest Neighbor Search: The Swiss Army Knife of GenAI

ANN (Approximate Nearest Neighbors) is at the core of fast vector search, itself central to GenAI, especially GPT and LLM. My new methodology, abbreviated as PANN, has many other applications: clustering, classification, measuring the similarity between two datasets (images, soundtracks, time series, and so on), tabular data synthetization (improving poor synthetizations), model evaluation, […]
