
Typically you have (1) an ML model that generates embeddings for the items in your db and (2) vector support in the db to order/compare those rows by similarity (e.g. cosine distance between embeddings). This just gives you the second part in a more convenient / efficient package. That's super cool, but it's only the second half.
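To make part (2) concrete, here is a rough sketch of what "order rows by cosine distance" means, in plain Python rather than in any particular db. The function names and the tiny 2-d embeddings are made up for illustration; a real vector extension would do this inside the database, usually with an index.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def rank_by_similarity(query, rows):
    """Sort (id, embedding) rows from closest to farthest from the query."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))

# Hypothetical rows: (id, embedding). Real embeddings have hundreds of dimensions.
rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [0.9, 0.1]),
]
ranked = rank_by_similarity([1.0, 0.0], rows)
print([row_id for row_id, _ in ranked])
```

Where the embeddings come from is not covered here at all; that is the first half.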

For the encoding model, you could use any ML model you want, from cheap, simple models to expensive, incredibly complex ones like GPT-3. You could even use a face recognition model to encode face images and sort/compare them the same way.

So this just makes it a lot easier to roll your own similarity system with an encoding model of your choice plugged in. If you have a lot of data to encode and aren't afraid of running your own model, it's a great piece of a solution. But it is not an all-in-one solution.
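A minimal sketch of the "plug in your own encoder" idea, assuming nothing about any real product: `SimilarityIndex` and `toy_encoder` are hypothetical names, and the encoder just counts letters as a stand-in for an actual ML model. The point is only that the storage/search side doesn't care which encoder you hand it.

```python
import math
from typing import Callable, List, Tuple

# Any function that maps an item to a fixed-length vector can act as the encoder.
Encoder = Callable[[str], List[float]]

def toy_encoder(text: str) -> List[float]:
    """Stand-in for a real model: a 26-dim vector of letter counts."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat empty vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

class SimilarityIndex:
    """In-memory stand-in for the db side: stores vectors, searches by brute force."""

    def __init__(self, encode: Encoder) -> None:
        self.encode = encode
        self.items: List[Tuple[str, List[float]]] = []

    def add(self, text: str) -> None:
        self.items.append((text, self.encode(text)))

    def search(self, query: str, k: int = 3) -> List[str]:
        q = self.encode(query)
        scored = sorted(
            self.items,
            key=lambda item: cosine_similarity(q, item[1]),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]

index = SimilarityIndex(toy_encoder)  # swap toy_encoder for a real model's encode function
for doc in ["postgres vector search", "face recognition", "vector database"]:
    index.add(doc)
print(index.search("face recognition", k=1))
```

Swapping in a better model only changes the function passed to the constructor; the search code stays the same.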


