A map of the latest 11 million papers split by semantic similarity and time slices [P]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
I have been building alternative ways to explore scientific literature. The goal was to make the large number of papers published daily easier to keep up with by visualising macroscopic trends. It is free to use at The Global Research Space for anyone interested in giving it a try!

How I built it

I sourced the latest 11M papers from OpenAlex and arXiv and encoded their titles and abstracts using SPECTER 2. I then projected the embeddings down to 2D using UMAP and created labels within Voronoi bounds around high-density peaks at progressively deeper levels. There is support for both keyword and semantic queries, plus an analytics layer for ranking institutions, authors, topics, etc. More recently, I added the ability to slide back and forth in time and a daily auto-ingestion script to keep the map up to date.

Feedback and suggestions are very welcome!
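For anyone curious what the labelling step might look like, here is a minimal sketch of "labels within Voronoi bounds around high-density peaks" on points that are already in 2D. It is not the code behind the map: the grid-histogram peak finder is a simple stand-in for whatever density estimate is actually used, and the names `density_peaks`, `voronoi_assign`, `label_cells`, and `ring`, the `cell` and `min_count` parameters, and the toy keywords are all hypothetical. Assigning each point to its nearest peak is equivalent to placing it in that peak's Voronoi cell.

```python
import math
from collections import Counter


def density_peaks(points, cell=1.0, min_count=5):
    """Bin 2D points on a square grid and return the centres of bins that
    hold at least min_count points and strictly more than all 8 neighbours.
    (Assumption: a coarse histogram stands in for a real density estimate.)"""
    counts = Counter((math.floor(x / cell), math.floor(y / cell)) for x, y in points)
    peaks = []
    for (i, j), n in counts.items():
        if n < min_count:
            continue
        neighbours = [
            counts.get((i + di, j + dj), 0)
            for di in (-1, 0, 1)
            for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)
        ]
        if all(n > m for m in neighbours):
            peaks.append(((i + 0.5) * cell, (j + 0.5) * cell))
    return peaks


def voronoi_assign(points, peaks):
    """Return, for each point, the index of its nearest peak.
    Nearest-peak assignment is exactly Voronoi cell membership."""
    return [
        min(range(len(peaks)), key=lambda k: math.dist(p, peaks[k]))
        for p in points
    ]


def label_cells(keywords, assignment):
    """Label each Voronoi cell with the most frequent keyword among its points."""
    by_cell = {}
    for kw, cell_id in zip(keywords, assignment):
        by_cell.setdefault(cell_id, Counter())[kw] += 1
    return {cell_id: c.most_common(1)[0][0] for cell_id, c in by_cell.items()}


def ring(cx, cy, n, radius):
    """Deterministic toy cluster: n points on a small circle around (cx, cy)."""
    return [
        (cx + radius * math.cos(2 * math.pi * k / n),
         cy + radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


# Toy data: two tight clusters standing in for 2D-projected paper embeddings.
points = ring(2.5, 2.5, 12, 0.3) + ring(7.5, 7.5, 12, 0.3)
keywords = ["graph"] * 8 + ["nlp"] * 4 + ["vision"] * 12

peaks = density_peaks(points, cell=1.0, min_count=5)
assignment = voronoi_assign(points, peaks)
print(peaks)
print(label_cells(keywords, assignment))
```

On the toy data, each ring falls inside a single grid bin, so the sketch finds one peak per ring and labels each cell with its majority keyword. Repeating the same procedure inside each cell with a smaller `cell` size would be one way to get labels at deeper levels, though that is only a guess at how a hierarchy could be built.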