🤔 Waiting for the next embedding model? Don't! Just tell us what you want your embeddings to excel at, e.g., car insurance claims, financial news, or Spanish dialogues. Specify your wish in a prompt; that prompt is your only input to our API. In about 30 minutes, we deliver a ready-to-use, fine-tuned embedding model that can be loaded via SentenceTransformers. Behind the scenes, we take care of everything else: generating useful synthetic data, managing the train-eval-test ML workflow, and finally uploading the fine-tuned model to the Hugging Face Hub. Yep, so much happens under this very minimal UI abstraction!
This is a new feature that we are alpha-testing with invited users: a minimalistic fine-tuning UX that eliminates the need to upload reference data or mine triplets and hard negatives manually. As a user, you simply specify your expectations. For instance, "I want my embeddings to excel at biomedical literature" or, as a more detailed instruction, "Please make it more effective on various subfields of artificial intelligence, particularly focusing on distinctions between machine learning, deep learning, and neural networks."
But how can we ensure the quality of the fine-tuned models? By feeding them high-quality data! To be frank, that is not an easy job, especially when the data is synthetic and comes from LLMs. It's easy to get started but hard to get right: simple prompting yields some ("boring") results, but producing diverse, effective triplets with genuinely hard negatives requires significant prompt engineering.
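To make the hard-negative idea concrete, here is a minimal, self-contained sketch (not the service's actual pipeline): given an anchor embedding and a positive, it keeps candidate negatives that sit close to the anchor yet clearly below the positive's similarity, ranking the hardest first. All vectors and the margin value are toy assumptions.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mine_hard_negatives(anchor, positive, candidates, margin=0.05):
    """Keep candidates less similar to the anchor than the positive
    (by at least `margin`), ranked with the hardest negatives first."""
    pos_sim = cosine(anchor, positive)
    scored = [(c, cosine(anchor, c)) for c in candidates]
    hard = [(c, s) for c, s in scored if s < pos_sim - margin]
    return sorted(hard, key=lambda t: -t[1])

# Toy 3-d "embeddings".
anchor = [1.0, 0.0, 0.0]
positive = [0.9, 0.1, 0.0]
negatives = [
    [0.8, 0.3, 0.1],  # hard negative: close to the anchor
    [0.0, 1.0, 0.0],  # easy negative: unrelated
]
ranked = mine_hard_negatives(anchor, positive, negatives)
print([round(sim, 2) for _, sim in ranked])  # hardest first
```

The hard negative (high similarity, wrong answer) is what makes a triplet informative for training; the easy one contributes almost nothing to the loss.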
We propose a Stochastic Augmented Generation framework, which has proven highly effective at generating useful training data for embedding models under a configurable budget. Have a look at the graphic to find out more.