Search help assistant

Context and problem statement

Kairos is a research assistant.
Objective: take a user query, enrich it, crawl the results, and present the top matches along with their similarity to the query and their content (topic extraction).

Goals

The goal is to let the user select the most interesting content to read, then navigate through the documents and understand their subjects and how they relate to one another (topic/similarity clustering).

Our involvement

1 Data Scientist

  • Similarity is measured by comparing word and document embeddings.
  • To compare a document with the query, the words are embedded and the cumulative energy required to move the query's words onto the document's matching words is computed; the lower this energy, the closer the document is to the query.
  • Clustering on embeddings gives consistent results. Extracting the most important sentences makes the texts easier to separate, all the more so once unnecessary sentences are removed.
  • For topic extraction, a multilabel topic classifier will be much more robust and efficient in the long run.
  • Using an unsupervised generative algorithm such as LDA (Latent Dirichlet Allocation) speeds up the labeling needed to train it.
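
The "cumulative energy" comparison described above resembles Word Mover's Distance. The project's actual implementation is not shown here, so the following is only a minimal sketch of the idea, using a relaxed variant in which each query word moves to its nearest document word. The toy 2-D vectors, the EMBEDDINGS table, and the function names are all hypothetical, chosen for illustration; a real system would load trained embeddings instead.

```python
import math

# Hypothetical toy 2-D word vectors, for illustration only.
# A real system would use trained embeddings with many more dimensions.
EMBEDDINGS = {
    "neural": (0.90, 0.10),
    "network": (0.80, 0.20),
    "deep": (0.85, 0.15),
    "learning": (0.70, 0.30),
    "tomato": (0.10, 0.90),
    "soup": (0.20, 0.80),
    "recipe": (0.15, 0.85),
}


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relaxed_transport_cost(query_tokens, doc_tokens, embeddings=EMBEDDINGS):
    """Average distance from each query word to its closest document word.

    This is a relaxation of Word Mover's Distance (it ignores how much
    mass each document word may receive), not the exact optimal-transport
    solution. Words missing from the embedding table are skipped.
    """
    q = [embeddings[t] for t in query_tokens if t in embeddings]
    d = [embeddings[t] for t in doc_tokens if t in embeddings]
    if not q or not d:
        return math.inf  # nothing to compare
    return sum(min(euclidean(qv, dv) for dv in d) for qv in q) / len(q)


def rank_documents(query, documents):
    """Return document names sorted from closest to farthest from the query."""
    q_tokens = query.lower().split()
    scored = [
        (relaxed_transport_cost(q_tokens, text.lower().split()), name)
        for name, text in documents.items()
    ]
    return [name for _, name in sorted(scored)]


documents = {
    "ml": "deep learning network",
    "food": "tomato soup recipe",
}
print(rank_documents("neural network", documents))  # → ['ml', 'food']
```

The relaxation keeps the sketch dependency-free. An exact Word Mover's Distance would instead solve an optimal-transport problem over the same pairwise distances.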

Results

All features have been developed.
The project has achieved all the defined objectives and is now in the industrialization phase.

Technical environment

Python, Keras, TensorFlow, Gensim, spaCy, NLTK, Docker, Scrapy, BeautifulSoup, Git
