Han Xiao

Berlin, Berlin, Germany

20,261 followers 500+ connections

极纳科技

Technische Universität München

About

Dr. Han Xiao is the Founder & CEO of Jina AI, a commercial opensource company based in…

Activity

We can all learn from Siemens! Jina AI was invited to the tangere Events by Siemens. We looked at the main challenges they pointed out in…

Liked by Han Xiao
One cool thing about ColBERT-based search vs. the cosine-based vector retrieval is that you get 𝐢𝐧𝐭𝐞𝐫𝐩𝐫𝐞𝐭𝐚𝐛𝐢𝐥𝐢𝐭𝐲 for free as a…

Shared by Han Xiao
Do you sometimes feel like fighting against your LLM? 🥊 (yes of course you do) Let me introduce you to 𝐀𝐢𝐤𝐢𝐝𝐨–𝐩𝐫𝐨𝐦𝐩𝐭𝐢𝐧𝐠 (paradigm we…

Liked by Han Xiao

Experience and Education

极纳科技

腾讯

*********** **** ** ******* ** ***
*** ***** **********

***** ****** ** ** ********** ************
********** *********ä* *ü*****

****** ** ********** (**.***.***.) ******** ******* ***** *** *****

2011–2014
********** *********ä* *ü*****

****** ** ******* (*.**.) ******** ******* ****** **** ***********

2009–2011

Publications

Learning Better while Sending Less: Communication-Efficient Online Semi-Supervised Learning in Client-Server Settings

IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA'2015) July 22, 2015
We consider a novel distributed learning problem: A server receives potentially unlimited data from clients in a sequential manner, but only a small initial fraction of these data are labeled. Because communication bandwidth is expensive, each client is limited to sending the server only a small (high-priority) fraction of the unlabeled data it generates, and the server is limited in the amount of prioritization hints it sends back to the client. The goal is for the server to learn a good model of all the client data from the labeled and unlabeled data it receives. This setting is frequently encountered in real-world applications and has the characteristics of online, semi-supervised, and active learning. However, previous approaches are not designed for the client-server setting and do not hold the promise of reducing communication costs. We present a novel framework for solving this learning problem in an effective and communication-efficient manner. On the server side, our solution combines two diverse learners working collaboratively, yet in distinct roles, on the partially labeled data stream. A compact, online graph-based semi-supervised learner is used to predict labels for the unlabeled data arriving from the clients. Samples from this model are used as ongoing training for a linear classifier. On the client side, our solution prioritizes data based on an active-learning metric that favors instances that are close to the classifier’s decision hyperplane and yet far from each other. To reduce communication, the server sends the classifier’s weight-vector to the client only periodically. Experimental results on real-world data sets show that this particular combination of techniques outperforms other approaches, and in particular, often outperforms (communication expensive) approaches that send all the data to the server.

Support Vector Machines under Adversarial Label Contamination

Journal of Neurocomputing, Special Issue on Advances in Learning with Label Noise July 21, 2014
Machine learning algorithms are increasingly being applied in security-related tasks such as spam and malware detection, although their security properties against deliberate attacks have not yet been widely understood. Intelligent and adaptive attackers may indeed exploit specific vulnerabilities exposed by machine learning techniques to violate system security. Being robust to adversarial data manipulation is thus an important, additional requirement for machine learning algorithms to successfully operate in adversarial settings. In this work, we evaluate the security of Support Vector Machines (SVMs) to well-crafted, adversarial label noise attacks. In particular, we consider an attacker that aims to maximize the SVM’s classification error by flipping a number of labels in the training data. We formalize a corresponding optimal attack strategy, and solve it by means of heuristic approaches to keep the computational complexity tractable. We report an extensive experimental analysis on the effectiveness of the considered attacks against linear and non-linear SVMs, both on synthetic and real-world datasets. We finally argue that our approach can also provide useful insights for developing more secure SVM learning algorithms, and also novel techniques in a number of related research areas, such as semi-supervised and active learning.

From Adversarial Learning to Reliable and Scalable Learning

Dissertation (magna cum laude) June 2014

Nowadays machine learning is considered a vital tool for data analysis and automatic decision making in many modern enterprise systems. However, there is an emerging threat that adversaries can mislead the decision of the learning algorithm by introducing security faults into the system. Previous security research did not closely examine the vulnerabilities of learning algorithms to adversarial manipulations. Understanding these threats is the only way to build robust learning algorithms for security-sensitive applications. This dissertation is organized in three parts, which contribute new results in adversarial, reliable, and scalable machine learning, respectively.
Efficient Online Sequence Prediction with Side Information

ICDM 2013 December 12, 2013
Sequence prediction is a key task in machine learning and data mining. It
involves predicting the next symbol in a sequence given its previous symbols.
Our motivating application is predicting the execution path of a process on an
operating system in real-time. In this case, each symbol in the sequence
represents a system call accompanied with arguments and a return value. We
propose a novel online algorithm for predicting the next system call by
leveraging both context and side information. The online update of our
algorithm is efficient in terms of time cost and memory consumption.
Experiments on real-world data sets showed that our method outperforms
state-of-the-art online sequence prediction methods in both accuracy and
efficiency, and incorporation of side information does significantly improve
the predictive accuracy.

Lazy Gaussian Process Committee for Real-Time Online Regression

AAAI 2013 July 7, 2013
A significant problem of Gaussian process (GP) is its
unfavorable scaling with a large amount of data. To overcome this issue, we
present a novel GP approximation scheme for online regression. Our model is
based on a combination of multiple GPs with random hyperparameters. The model is
trained by incrementally allocating new examples to a selected subset of GPs.
The selection is carried out efficiently by optimizing a submodular function.
Experiments on real-world data sets showed that our method outperforms existing
online GP regression methods in both accuracy and efficiency. The applicability
of the proposed method is demonstrated by the mouse-trajectory prediction in an
Internet banking scenario.

Honors & Awards

Finalist of Chunhui Venture Competition

Chinese Ministry of Education

Nov. 2016

We build an AI-backed marketplace connecting private investors with their best matching quants and strategies. We are a party venue for quants' intellectual quest and exposition, as well as a rewarding feast to fulfill private investors' financial expectations. Techniques such as backtesting, recommendation, search and chatbot are heavily used.
Runner-up in Quantopian Open Contest #3

Quantopian (see https://www.quantopian.com/leaderboard/3)

Apr. 2015

Developed an online portfolio balancing algorithm that achieved a 50% annual return in live paper trading. In April 2015, my algorithm ranked 2nd out of more than 300 algorithms in paper trading.

Details can be found here:
https://www.quantopian.com/leaderboard/3
31337 & Audience Award

Zalando

Dec. 2014

Developed Zketch during Zalando's hack week, a search engine that retrieves products matching a hand-drawn sketch query.

31337: Awarded for the most geeky project in terms of nasty technical challenges solved, extreme difficulty, use of low-level programming language or extreme networking hacking etc.

Audience Award: After every presentation we do a "clapping session" and measure the noise level in dB. The team with the loudest applause wins.
Chinese Government Award for Outstanding Ph.D. Students Abroad

Chinese Ministry of Education

Apr. 2014

The award recognizes the academic excellence of Chinese Ph.D. students studying overseas. It is granted across all fields of study. In 2013, 40 Chinese Ph.D. students in Germany received the award.

http://www.in.tum.de/fuer-studierende/aktuelles/detail/newsarticle/chinesische-regierung-zeichnet-informatik-doktorand-aus.html
Student travel award

Association for the Advancement of Artificial Intelligence (AAAI), 2013

June 2013

This award recognizes the paper "Lazy Gaussian Process Committee for Real-Time Online Regression"
Best paper award

ACM SIGKDD Workshop on Knowledge Discovery, Modeling, and Simulation 2011

Aug. 2011

This award recognizes the paper "Supervised Topic Transition Model for Detecting Malicious System Call Sequences".

The sponsor of this award is Science Applications International Corporation (SAIC).
ImagineCup Runner-up

Microsoft China

Apr. 2007

Participated in the ImagineCup software design competition (China region) with Xu Pu. We used C# and .NET to develop a sound-based interface that helps visually impaired users gain better control of their PC.

The award was presented by Bill Gates himself at Peking University in 2007.

Organizations

Association for the Advancement of Artificial Intelligence (AAAI)

Student member

March 2013–May 2014

More activity by Han Xiao

Can't we just use LLM for reranking? Just throw the 𝐪𝐮𝐞𝐫𝐲, 𝐝𝐨𝐜𝟏, 𝐝𝐨𝐜𝟐,...𝐝𝐨𝐜𝐍 into the context window and let the LLM figure out the…

Shared by Han Xiao
An Easter egg hidden on Jina AI website is the use of Reranker for article recommendations. Go to any blog post page, hit 𝐒𝐡𝐢𝐟𝐭+𝟐, and you will…

Shared by Han Xiao
Jina AI just released a multilingual reranker model for RAG and retrieval. It's quite efficient, and performs well for English and beyond. Sadly, it…

Liked by Han Xiao
Today, we are releasing 𝗝𝗶𝗻𝗮 𝗥𝗲𝗿𝗮𝗻𝗸𝗲𝗿 𝘃𝟮 (jina-reranker-v2-base-multilingual), our latest and the most powerful neural reranker model…

Shared by Han Xiao
Thanks, Christian Haug for organizing this two-week event 👏 https://www.zuberlin.city/ It was fun engaging with the curious audience, especially…

Liked by Han Xiao
Can you find all three oopsies in the picture? I'm back in Berlin after the European AI Startup Matchday in Malmö 🇸🇪. It was a great opportunity to…

Liked by Han Xiao
Jina AI Open Sources Jina CLIP: A State-of-the-Art English Multimodal (Text-Image) Embedding Model Article: https://lnkd.in/gxWGcgaE Jina AI…

Liked by Han Xiao
Building MuRAG (Multimodal RAG)? We make 𝐉𝐢𝐧𝐚 𝐂𝐋𝐈𝐏 open source and available on Hugging Face! You can now use it via 𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐞𝐫𝐬…

Shared by Han Xiao
