🚀 Launch Alert: NanoDL - The Ultimate Jax/Flax Library for Building Your Transformer Models 🚀

Key Features:
- A broad range of blocks and layers for building custom transformer models from scratch.
- Complete models such as LlaMa2, Mistral, Mixtral, GPT3, GPT4 (inferred), T5, Whisper, ViT, Mixers, GAT, CLIP, and more.
- Data-parallel distributed trainers that eliminate the need for manual training loops.
- The much-needed dataloader for Jax/Flax.
- Specialised layers absent from Flax/Jax, such as RoPE, GQA, MQA, and Swin attention.
- GPU/TPU-accelerated classical ML models, including PCA, KMeans, and Regression.
- The flexibility to build hybrid transformer models, combining elements from GPT, Mixtral, LlaMa2, and more.
- A range of GPU/TPU-accelerated NLP and vision algorithms such as Gaussian Blur, BLEU, and MAP.
- Built on Jax, renowned for its speed in compute-intensive ML tasks, even on CPUs.
- Every layer is a Flax Module and can be used directly like any other Flax layer.
- Fully open source, with independently downloadable model files for hassle-free copying 😉.

Get Started Now:
The dev version of NanoDL is available on PyPI. Install it with `pip install nanodl`, and make sure Jax, Flax, and Optax are set up correctly. See the repo for more: https://lnkd.in/djm7qqA6

We can't wait to see the incredible projects you'll build with NanoDL. Your feedback is crucial to us, so feel free to share your thoughts and experiences.

🔥 #Python #DeepLearning #NanoDL #NewRelease #Jax #Flax #Sklearn #MachineLearning 🔥
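To give a feel for the workflow, here is a minimal training sketch loosely modelled on the examples in the repo's README. Treat the class names (`ArrayDataset`, `DataLoader`, `GPT4`, `GPTDataParallelTrainer`) and their signatures as assumptions and check the repo for the exact API:

```python
import jax.numpy as jnp
from nanodl import ArrayDataset, DataLoader          # assumed names; verify against the repo
from nanodl import GPT4, GPTDataParallelTrainer

# Dummy next-token-prediction data: replace with real tokenised text
data = jnp.ones((128, 11), dtype=jnp.int32)
inputs, targets = data[:, :-1], data[:, 1:]

dataset = ArrayDataset(inputs, targets)
dataloader = DataLoader(dataset, batch_size=8, shuffle=True, drop_last=False)

# A tiny "inferred GPT4" configuration, purely for illustration
model = GPT4(num_layers=2, hidden_dim=256, num_heads=4, feedforward_dim=512,
             dropout=0.1, vocab_size=1000, embed_dim=256, max_length=10,
             start_token=0, end_token=50)

# The data-parallel trainer replaces the manual training loop
trainer = GPTDataParallelTrainer(model, inputs.shape, 'params.pkl')
trainer.train(train_loader=dataloader, num_epochs=2, val_loader=dataloader)
```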
NanoDL
Research Services
A Jax-based library for designing and training transformer models from scratch, along with GPU-powered SKLearn-style classical ML.
About us
Developing and training transformer-based models is typically resource-intensive and time-consuming, and AI/ML experts frequently need to build smaller-scale versions of these models for specific problems. Jax, a low-resource yet powerful framework, accelerates the development of neural networks, but existing resources for transformer development in Jax are limited. NanoDL addresses this challenge with the following features:

• A wide array of blocks and layers, facilitating the creation of customised transformer models from scratch.
• An extensive selection of models like LlaMa2, Mistral, Mixtral, GPT3, GPT4 (inferred), T5, Whisper, ViT, Mixers, GAT, CLIP, and more, catering to a variety of tasks and applications.
• Data-parallel distributed trainers, so developers can efficiently train large-scale models on multiple GPUs or TPUs without writing manual training loops.
• Dataloaders, making data handling for Jax/Flax more straightforward and effective.
• Custom layers not found in Flax/Jax, such as RoPE, GQA, MQA, and Swin attention, allowing for more flexible model development (a plain-Jax sketch of RoPE follows after this list).
• GPU/TPU-accelerated classical ML models like PCA, KMeans, Regression, and Gaussian Processes, akin to SciKit Learn on GPU.
• A modular design, so users can blend elements from various models, such as GPT, Mixtral, and LlaMa2, to craft unique hybrid transformer models.
• A range of advanced algorithms for NLP and computer vision tasks, such as Gaussian Blur and BLEU.
• Each model is contained in a single file with no external dependencies, so the source code can also be easily reused.
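As a taste of what one of those specialised layers computes, here is a plain-Jax sketch of rotary position embeddings (RoPE). The `rope` function below is our own illustration of the maths, not NanoDL's actual implementation:

```python
import jax
import jax.numpy as jnp

def rope(x: jnp.ndarray, base: float = 10000.0) -> jnp.ndarray:
    """Rotary position embeddings over x of shape (seq_len, num_heads, head_dim).

    Each pair of channels is rotated by an angle that grows with the token's
    position, so relative offsets become visible to dot-product attention.
    """
    seq_len, _, head_dim = x.shape
    half = head_dim // 2
    freqs = base ** (-jnp.arange(half) / half)                # per-pair frequencies
    angles = jnp.arange(seq_len)[:, None] * freqs[None, :]    # (seq_len, half)
    cos = jnp.cos(angles)[:, None, :]                         # broadcast over heads
    sin = jnp.sin(angles)[:, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return jnp.concatenate([x1 * cos - x2 * sin,
                            x1 * sin + x2 * cos], axis=-1)

q = jax.random.normal(jax.random.PRNGKey(0), (16, 4, 64))     # (seq, heads, head_dim)
q_rotated = rope(q)                                           # same shape, positions encoded
```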
- Website
- https://github.com/HMUNACHI/nanodl
- Industry
- Research Services
- Company size
- 2-10 employees
- Headquarters
- London
- Type
- Nonprofit
- Founded
- 2020
- Specialties
- artificial intelligence, machine learning, natural language processing, computer vision, and computational neuroscience
Locations
-
Primary
6 Scarlet Close
East Village, Stratford
London, E20 1FN, GB
Updates
-
What's New in NanoDL 1.2.0.dev1:
- The Gemma architecture from Google DeepMind.
- A reward-model wrapper and a data-parallel distributed reward trainer.
- A true-random module for Jax that bypasses key-handling verbosity by internally using the current time to generate keys.
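For context on that last item: Jax normally requires PRNG keys to be created and split by hand. A convenience module can hide that plumbing by seeding a key from the clock. The helper below is a sketch of the idea, not NanoDL's actual module:

```python
import time
import jax

# Standard Jax: keys are created and split explicitly
key = jax.random.PRNGKey(0)
key, subkey = jax.random.split(key)
x = jax.random.normal(subkey, (3,))

def time_seeded_key():
    """Derive a fresh PRNG key from the current time (hypothetical helper)."""
    return jax.random.PRNGKey(time.time_ns() % (2**31))

# The same call, with no manual key management
y = jax.random.normal(time_seeded_key(), (3,))
```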
-
Hosted APIs and HuggingFace provide solid pre-trained LLMs, but in those special situations where you need to build an efficient custom model, with or without the same architecture, NanoDL simplifies and streamlines the process. https://lnkd.in/djm7qqA6
-
While the true implementation details of OpenAI's #gpt4 are closely guarded by the company, it is rumoured to use a Mixture-of-Experts technique. With NanoDL, you can implement this inferred GPT4 in a few lines of code! https://lnkd.in/dcJf656K
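For readers unfamiliar with the technique, this is roughly what a Mixture-of-Experts feed-forward block looks like in Flax: a router scores each token and routes it to one expert MLP. It is a dense, top-1-routing illustration of the idea, not NanoDL's implementation (real MoE layers dispatch tokens sparsely for efficiency):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class MoEFeedForward(nn.Module):
    """Top-1-routed Mixture-of-Experts MLP: the router scores every token,
    and only the best-scoring expert's output is kept for that token."""
    num_experts: int = 4
    hidden_dim: int = 256
    ffn_dim: int = 512

    @nn.compact
    def __call__(self, x):                                        # x: (batch, seq, hidden_dim)
        weights = jax.nn.softmax(nn.Dense(self.num_experts)(x), axis=-1)  # router scores
        top_expert = jnp.argmax(weights, axis=-1)                 # (batch, seq)
        out = jnp.zeros_like(x)
        for e in range(self.num_experts):
            expert = nn.Sequential([nn.Dense(self.ffn_dim), jax.nn.gelu,
                                    nn.Dense(self.hidden_dim)])
            mask = (top_expert == e)[..., None]                   # tokens routed to expert e
            out += jnp.where(mask, expert(x) * weights[..., e:e+1], 0.0)
        return out

layer = MoEFeedForward()
params = layer.init(jax.random.PRNGKey(0), jnp.ones((2, 8, 256)))
```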
-
Whisper from OpenAI is one of the most powerful speech-to-text models. With NanoDL, implementing this architecture takes only a few lines of code; see the link for the full example. https://lnkd.in/dhx9Ywfd
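Since the full example sits behind the link, here is a toy Flax sketch of the overall Whisper-style architecture: a transformer encoder over log-mel audio frames, and a text decoder that cross-attends to it. The `TinySpeechToText` module is purely illustrative and is not NanoDL's Whisper:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class TinySpeechToText(nn.Module):
    """Toy encoder-decoder: encode audio features, decode text with cross-attention."""
    vocab_size: int = 1000
    dim: int = 64

    @nn.compact
    def __call__(self, mel, tokens):
        # mel: (batch, frames, n_mels); tokens: (batch, text_len) int32
        enc = nn.Dense(self.dim)(mel)
        enc = enc + nn.SelfAttention(num_heads=4)(enc)            # audio encoder block
        dec = nn.Embed(self.vocab_size, self.dim)(tokens)
        dec = dec + nn.SelfAttention(num_heads=4)(dec)            # causal mask omitted for brevity
        dec = dec + nn.MultiHeadDotProductAttention(num_heads=4)(dec, enc)  # cross-attention
        return nn.Dense(self.vocab_size)(dec)                     # per-position token logits

model = TinySpeechToText()
params = model.init(jax.random.PRNGKey(0),
                    jnp.ones((1, 100, 80)),                       # 100 frames of 80 log-mel bins
                    jnp.ones((1, 12), dtype=jnp.int32))
```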
-
With NanoDL, you can build your Diffusion Model in fewer lines! #gpt #diffusionmodels #deeplearning #jax #flax #pytorch #tensorflow #sklearn
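For a sense of what that involves under the hood, here is a plain-Jax sketch of the DDPM-style objective a diffusion model trains on: noise an image to a random timestep and train the network to predict that noise. `apply_fn` stands in for whatever denoising network you build; none of this is NanoDL's actual trainer:

```python
import jax
import jax.numpy as jnp

def ddpm_loss(params, apply_fn, images, key, timesteps=1000):
    """DDPM training objective: predict the Gaussian noise mixed into `images`
    at a random diffusion step t (sketch only; `apply_fn` is your denoiser)."""
    key_t, key_eps = jax.random.split(key)
    t = jax.random.randint(key_t, (images.shape[0],), 0, timesteps)
    betas = jnp.linspace(1e-4, 0.02, timesteps)                   # linear noise schedule
    alpha_bar = jnp.cumprod(1.0 - betas)[t]                       # remaining signal per example
    eps = jax.random.normal(key_eps, images.shape)
    bshape = (-1,) + (1,) * (images.ndim - 1)                     # broadcast over image dims
    noisy = (jnp.sqrt(alpha_bar).reshape(bshape) * images
             + jnp.sqrt(1.0 - alpha_bar).reshape(bshape) * eps)
    pred_eps = apply_fn(params, noisy, t)                         # network predicts the noise
    return jnp.mean((pred_eps - eps) ** 2)

# Gradients w.r.t. params for one training step, given any denoiser `apply_fn`
grad_fn = jax.grad(ddpm_loss)
```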