this post was submitted on 11 Sep 2024
68 points (95.9% liked)
Data is Beautiful
Yeah, I've noticed there aren't many clusters that encode specific ideas (there are a few, like the anime, NSFW, or sometimes instance-level clusters). Most of it just seems to be a blend. Sorta disappointing.
~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~
Are they clustered based on shared userbase?
Yeah pretty much. There is also a weighting based on the percentage of comments in that community that come from that user.
~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~
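For anyone curious what "shared userbase, weighted by comment share" could look like in practice, here is a rough sketch. It is not the actual code behind the visualization: the `(user, community)` input format, the function names, the sample data, and the choice of cosine similarity are all assumptions made for illustration. Each community becomes a vector over users, where a user's weight is the fraction of that community's comments they wrote, and two communities count as similar when those vectors overlap.

```python
from collections import Counter, defaultdict
from math import sqrt


def community_vectors(comments):
    """Build one weighted user vector per community.

    `comments` is an iterable of (user, community) pairs, one per comment
    (a hypothetical input format chosen for this sketch).
    """
    counts = defaultdict(Counter)
    for user, community in comments:
        counts[community][user] += 1

    vectors = {}
    for community, per_user in counts.items():
        total = sum(per_user.values())
        # Weight = share of this community's comments written by the user.
        vectors[community] = {u: n / total for u, n in per_user.items()}
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse user-weight vectors."""
    dot = sum(w * b[u] for u, w in a.items() if u in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up sample data, purely illustrative.
comments = [
    ("alice", "anime"), ("alice", "anime"), ("bob", "anime"),
    ("alice", "manga"), ("bob", "manga"),
    ("carol", "gardening"),
]
vecs = community_vectors(comments)
print(cosine(vecs["anime"], vecs["manga"]))      # shared users, high similarity
print(cosine(vecs["anime"], vecs["gardening"]))  # no shared users, zero
```

A similarity matrix built this way could then be handed to whatever clustering or layout step you like. With only a few commenters per community, a handful of very active users dominates each vector, so blended-looking clusters wouldn't be surprising.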
There's not enough data yet for the noise to cancel itself out, I think.
Place- and language-specific clusters are pretty coherent, if you go looking.