Danterious

joined 1 year ago
[–] Danterious@lemmy.dbzer0.com 2 points 2 hours ago

Yeah pretty much. There is also a weighting based on the percentage of comments in that community that come from that user.
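A rough sketch of that weighting, in case it helps. This is not the code from the repo; the function name and the `(community, user)` input format are made up for illustration. Each user's weight in a community is their share of that community's comments.

```python
from collections import Counter, defaultdict


def comment_share_weights(comments):
    """comments: iterable of (community, user) pairs, one pair per comment.

    Returns {community: {user: share}}, where the shares within each
    community sum to 1. The input format is an assumption for this sketch.
    """
    counts = defaultdict(Counter)
    for community, user in comments:
        counts[community][user] += 1

    weights = {}
    for community, per_user in counts.items():
        total = sum(per_user.values())
        weights[community] = {user: n / total for user, n in per_user.items()}
    return weights


# Hypothetical sample data: alice wrote 2 of the 4 comments in "linux".
sample = [
    ("linux", "alice"), ("linux", "alice"), ("linux", "bob"), ("linux", "carol"),
    ("anime", "bob"), ("anime", "bob"),
]
weights = comment_share_weights(sample)
print(weights["linux"]["alice"])  # 0.5
```

Normalizing per community keeps a very busy community from dominating just because it has more comments overall.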

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] Danterious@lemmy.dbzer0.com 2 points 19 hours ago (1 children)

I don't think it was included because there were no new comments made after August 1.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

I had to try scraping the websites multiple times because of stupid bugs I put in the code beforehand, so I might have put more strain on the instances than I meant to. If I did this again it would hopefully be much less taxing on the servers.

As for the cost of scraping, it actually isn't that hard. I just had it running in the background most of the time.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] Danterious@lemmy.dbzer0.com 5 points 1 day ago (3 children)

Yeah, I've noticed there aren't many clusters that encode specific ideas (there are a few, like the anime, nsfw, or sometimes instance-level clusters). Most of it just seems to be a blend. Sorta disappointing.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] Danterious@lemmy.dbzer0.com 3 points 1 day ago (2 children)

Probably a WebGL problem. I had to use Ungoogled Chromium to open the page. I think it works on regular Firefox too.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

Yeah that sounds like a good idea so you can see how connected local communities are. Probably makes more sense to use original dimensions so no extra information is lost.
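One way this could look, as a sketch only: compare communities directly on their full user-weight vectors (the original dimensions) instead of their 2D map positions. Cosine similarity and the "mean pairwise similarity" score are my own picks here, not something from the repo, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """a, b: sparse {user: weight} vectors for two communities."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def instance_connectedness(vectors):
    """vectors: {community: {user: weight}} for the communities of one instance.

    Returns the mean pairwise cosine similarity, or 0.0 if there are
    fewer than two communities.
    """
    names = sorted(vectors)
    pairs = [(x, y) for i, x in enumerate(names) for y in names[i + 1:]]
    if not pairs:
        return 0.0
    total = sum(cosine_similarity(vectors[x], vectors[y]) for x, y in pairs)
    return total / len(pairs)


# Hypothetical instance with three communities; only "a" and "b" share a user.
local = {"a": {"u1": 1.0}, "b": {"u1": 1.0}, "c": {"u2": 1.0}}
score = instance_connectedness(local)
print(round(score, 3))  # 0.333
```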

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] Danterious@lemmy.dbzer0.com 10 points 1 day ago

Something that I find interesting is how close together the central clusters of beehaw.org, slrpnk.net, and lemmy.blahaj.zone are. If you only highlight those instances, you can see how close their communities tend to be.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] Danterious@lemmy.dbzer0.com 11 points 1 day ago* (last edited 1 day ago) (2 children)

Total communities: 2986

Total users: 21934

So the dimensions were reduced from (2986, 21934) to (2986, 2)

Edit: Also yeah, it is using UMAP for the algorithm and it does do something pretty similar to what you described.
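For anyone curious what that shape change means in practice, here is a minimal sketch. It assumes the `umap-learn` and `numpy` packages; the helper names, the `random_state` value, and the default UMAP settings are my assumptions, not the parameters actually used for the map.

```python
def build_matrix(weights, communities, users):
    """Dense matrix with one row per community and one column per user.

    weights: {community: {user: weight}}. Missing entries become 0.0.
    """
    column = {user: j for j, user in enumerate(users)}
    matrix = []
    for community in communities:
        row = [0.0] * len(users)
        for user, weight in weights.get(community, {}).items():
            row[column[user]] = weight
        matrix.append(row)
    return matrix


def reduce_to_2d(matrix, random_state=42):
    """Project rows to 2D with UMAP: (n_communities, n_users) -> (n_communities, 2)."""
    # Imported lazily so build_matrix works without umap-learn and numpy installed.
    import numpy as np
    import umap

    reducer = umap.UMAP(n_components=2, random_state=random_state)
    return reducer.fit_transform(np.asarray(matrix))


# Tiny hypothetical input: 2 communities x 3 users.
demo_weights = {"linux": {"alice": 0.5, "bob": 0.5}, "anime": {"bob": 1.0}}
demo = build_matrix(demo_weights, ["linux", "anime"], ["alice", "bob", "carol"])
print(len(demo), len(demo[0]))  # 2 3
```

With the real data the input would be 2986 rows by 21934 columns, and `reduce_to_2d` would return 2986 rows by 2 columns.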

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

I was somehow able to get both a picture and url added and it looks much better. Thx.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

24
Map of 2000+ lemmy communities (danterious.codeberg.page)
submitted 1 day ago* (last edited 1 day ago) by Danterious@lemmy.dbzer0.com to c/chat@beehaw.org
 

cross-posted from: https://lemmy.dbzer0.com/post/27579423

This is my first try at creating a map of lemmy. I based it on the overlap of commenters who visited certain communities.

I only used communities that were on the top 35 active instances for the past month and limited the comments to go back to a maximum of August 1, 2024 (sometimes shorter if I got an invalid response).
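The date cutoff could be sketched roughly like this. It is not the crawler's actual code: it assumes comments arrive newest first and that each one has a `published` ISO 8601 timestamp, which is my assumption about the data format.

```python
from datetime import datetime, timezone

# Oldest comment date to keep, per the post: August 1, 2024.
CUTOFF = datetime(2024, 8, 1, tzinfo=timezone.utc)


def parse_timestamp(text):
    """Parse an ISO 8601 timestamp, treating naive values as UTC."""
    # Replace a trailing "Z" so this also works on Python versions before 3.11.
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def comments_since_cutoff(comments, cutoff=CUTOFF):
    """comments: list of dicts sorted newest first (an assumption).

    Stops at the first comment older than the cutoff, so a crawler
    using this would not need to request any older pages.
    """
    kept = []
    for comment in comments:
        if parse_timestamp(comment["published"]) < cutoff:
            break
        kept.append(comment)
    return kept


# Hypothetical sample, newest first.
sample_comments = [
    {"id": 3, "published": "2024-08-20T10:00:00Z"},
    {"id": 2, "published": "2024-08-01T00:00:00Z"},
    {"id": 1, "published": "2024-07-31T23:59:59Z"},
]
recent = comments_since_cutoff(sample_comments)
print([c["id"] for c in recent])  # [3, 2]
```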

I scaled it so it was based on the percentage of comments made by a commenter in that community.

Here is the code for the crawler and data that was used to make the map:

https://codeberg.org/danterious/Lemmy_map

[–] Danterious@lemmy.dbzer0.com 3 points 1 day ago (1 children)

If I can figure that out I would definitely do that.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] Danterious@lemmy.dbzer0.com 27 points 1 day ago* (last edited 1 day ago)

Either the people in !steamdeck@lemmy.world are pretty horny or it's an artifact of the dimensionality reduction and means nothing.

Edit: Actually it could also be that it just didn't collect enough data on that community and the most recent person was also active in nsfw communities. I was only able to get back 14ish days in the data for lemmy.world. They produce way too many comments and I got kicked out early.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

57
Map of 2000+ lemmy communities (danterious.codeberg.page)
submitted 1 day ago* (last edited 1 day ago) by Danterious@lemmy.dbzer0.com to c/fediverse@lemmy.ml
 

cross-posted from: https://lemmy.dbzer0.com/post/27579423

This is my first try at creating a map of lemmy. I based it on the overlap of commenters who visited certain communities.

I only used communities that were on the top 35 active instances for the past month and limited the comments to go back to a maximum of August 1, 2024 (sometimes shorter if I got an invalid response).

I scaled it so it was based on the percentage of comments made by a commenter in that community.

Here is the code for the crawler and data that was used to make the map:

https://codeberg.org/danterious/Lemmy_map

181
Map of 2000+ lemmy communities (danterious.codeberg.page)
submitted 1 day ago* (last edited 1 day ago) by Danterious@lemmy.dbzer0.com to c/fediverse@lemmy.world
 

This is my first try at creating a map of lemmy. I based it on the overlap of commenters who visited certain communities.

I only used communities that were on the top 35 active instances for the past month and limited the comments to go back to a maximum of August 1, 2024 (sometimes shorter if I got an invalid response).

I scaled it so it was based on the percentage of comments made by a commenter in that community.

Here is the code for the crawler and data that was used to make the map:

https://codeberg.org/danterious/Lemmy_map

 

cross-posted from: https://lemmy.dbzer0.com/post/27216373

Instead of focusing on creating good algorithms to push certain content to users, why don't we focus on creating a good map that allows users to find the kind of content they want more easily?

I found this website that created a map of reddit with different countries for different topics and I thought it would translate to lemmy because instances sort of do this already really well.

https://anvaka.github.io/map-of-reddit/

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

 

Instead of focusing on creating good algorithms to push certain content to users, why don't we focus on creating a good map that allows users to find the kind of content they want more easily?

I found this website that created a map of reddit with different countries for different topics and I thought it would translate to lemmy because instances sort of do this already really well.

https://anvaka.github.io/map-of-reddit/

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

 

cross-posted from: https://lemmy.dbzer0.com/post/25287498

cross-posted from: https://lemmy.zip/post/19638259

There are about 6 pages.dev domains spamming lemmy.world communities

The volume is definitely inorganic, and is across a wide range of communities

pages.dev is Cloudflare's site hosting which can be used for free - there are likely many legitimate sites that use that domain, but the current flood is suspicious

chronicleresolve.pages.dev

thefreedomproject.pages.dev

versarch.pages.dev

dailypulse.pages.dev

newssphere-6fu.pages.dev

iniko.pages.dev

miniza.pages.dev

orino.pages.dev
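If someone wants to check a list of post links for this pattern, a small sketch like the one below would count links per `pages.dev` subdomain. The function name is made up, and the paths in the sample URLs are invented for illustration. A high count only marks a domain for a closer look, since many legitimate sites use `pages.dev` too.

```python
from collections import Counter
from urllib.parse import urlparse


def pages_dev_counts(urls, suffix=".pages.dev"):
    """Count links per hostname, keeping only hostnames under the given suffix."""
    counts = Counter()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(suffix):
            counts[host] += 1
    return counts


# Hostnames come from the list above; the paths are made up.
sample_urls = [
    "https://dailypulse.pages.dev/story-1",
    "https://dailypulse.pages.dev/story-2",
    "https://versarch.pages.dev/story-3",
    "https://example.org/not-pages.dev",
]
flagged = pages_dev_counts(sample_urls)
print(flagged.most_common())
```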

I'm cross-posting because @lenny_marlane@lemmy.ml seems to be doing the same thing.

It might be an attack vector or something idk but better safe than sorry.

Not sure about this one but seems to be following same pattern.

@marvelous_coyote@lemm.ee

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

 

cross-posted from: https://lemmy.dbzer0.com/post/25357952

I saw this and thought it would be useful for noticing and analyzing trends across the web, and the fediverse specifically, which could help with finding disinformation.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

 

I saw this and thought it would be useful for noticing and analyzing trends across the web, and the fediverse specifically, which could help with finding disinformation.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~
