Related content using embeddings #79

Closed
simonw opened this issue Aug 15, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

simonw commented Aug 15, 2023

I'm writing a TIL about this as I go.

simonw commented Aug 15, 2023

The first successful run took this long:

image

If I got the code right, subsequent runs should be a lot faster, since they will only calculate similarities involving the most recently added entries.
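
A minimal sketch of that incremental idea, not the actual code behind this change. It assumes each entry already has an embedding vector and scores only the pairs that involve a newly added entry, so the work grows with the number of new entries instead of with every possible pair. `cosine`, `update_similarities`, and the toy vectors are hypothetical names and data used purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def update_similarities(embeddings, similarities, new_ids):
    """Score only pairs that involve at least one id in new_ids.

    embeddings:   dict mapping id -> vector (hypothetical in-memory store)
    similarities: dict mapping (id, other_id) -> score, updated in place
    Returns the number of cosine comparisons performed.
    """
    computed = 0
    for new_id in new_ids:
        for other_id, vector in embeddings.items():
            if other_id == new_id:
                continue  # never compare an entry with itself
            score = cosine(embeddings[new_id], vector)
            # Store both directions so lookups by either id work.
            similarities[(new_id, other_id)] = score
            similarities[(other_id, new_id)] = score
            computed += 1
    return computed


# Toy data: two existing entries plus one newly added entry.
embeddings = {
    "a.md": [1.0, 0.0],
    "b.md": [0.0, 1.0],
    "c.md": [1.0, 1.0],  # the new entry
}
similarities = {}
comparisons = update_similarities(embeddings, similarities, ["c.md"])
print(comparisons, "comparisons for 1 new entry")
```

Pairs between existing entries (here `a.md` and `b.md`) are never rescored, which is where the time saving on later runs would come from under these assumptions.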

simonw commented Aug 15, 2023

Here's the new SQL query: https://til.simonwillison.net/tils?sql=select%0D%0A++til.topic%2C+til.slug%2C+til.title%2C+til.created%0D%0Afrom+til%0D%0Ajoin+similarities+on+til.path+%3D+similarities.other_id%0D%0Awhere+similarities.id+%3D+%27python_pyproject.md%27%0D%0Aorder+by+similarities.score+desc+limit+10

select
  til.topic, til.slug, til.title, til.created
from til
join similarities on til.path = similarities.other_id
where similarities.id = 'python_pyproject.md'
order by similarities.score desc limit 10
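
To try the query outside of Datasette, here is a rough sketch using Python's built-in `sqlite3` module against an in-memory database. The table and column names come from the query above; the column types, primary keys, the `related_tils` helper, and every row of sample data are assumptions made up for illustration, not the real schema or content.

```python
import sqlite3


def related_tils(conn, path, limit=10):
    """Return the TILs most similar to `path`, best score first."""
    sql = """
    select
      til.topic, til.slug, til.title, til.created
    from til
    join similarities on til.path = similarities.other_id
    where similarities.id = ?
    order by similarities.score desc limit ?
    """
    return conn.execute(sql, (path, limit)).fetchall()


conn = sqlite3.connect(":memory:")
# Assumed schema: only the column names are taken from the query.
conn.executescript(
    """
    create table til (
      path text primary key, topic text, slug text, title text, created text
    );
    create table similarities (
      id text, other_id text, score real, primary key (id, other_id)
    );
    """
)
# Placeholder rows, invented for this example.
conn.executemany(
    "insert into til values (?, ?, ?, ?, ?)",
    [
        ("python_pyproject.md", "python", "pyproject", "Placeholder A", "2023-01-01"),
        ("python_example-b.md", "python", "example-b", "Placeholder B", "2023-01-02"),
        ("bash_example-c.md", "bash", "example-c", "Placeholder C", "2023-01-03"),
    ],
)
conn.executemany(
    "insert into similarities values (?, ?, ?)",
    [
        ("python_pyproject.md", "python_example-b.md", 0.9),
        ("python_pyproject.md", "bash_example-c.md", 0.8),
        ("bash_example-c.md", "python_example-b.md", 0.7),
    ],
)

for row in related_tils(conn, "python_pyproject.md"):
    print(row)
```

Passing the path and limit as parameters, instead of embedding them in the SQL string, keeps the same query reusable for any entry.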

simonw closed this as completed in 07f43b6 Aug 15, 2023
simonw added the enhancement (New feature or request) label Aug 15, 2023

simonw commented Aug 15, 2023

Didn't quite work:

Skipped 446 rows that already existed
[{"id": "svg_dynamic-line-chart.md", "other_id": "observable-plot_wider-tooltip-areas.md", "score": 0.7923009914460658}]
error: Could not access 'HEAD~10'
Error: Must specify entries or --all

simonw reopened this Aug 15, 2023
simonw added a commit that referenced this issue Aug 15, 2023
Also bumped version on some actions

simonw commented Aug 15, 2023

OK, the second run took 8s, which is what I hoped for:

image

simonw added a commit to simonw/openai-to-sqlite that referenced this issue Aug 15, 2023