A generational gap on Wikipedia - 91% of WP admins started editing before 2010

antonim@lemmy.dbzer0.com · 3 hours ago

Chiquita and Nestlé come to mind. Within the tech industry, I’d say Amazon and probably Microsoft are worse as well, and there’s probably a ton of potentially even worse companies lurking in the shadows outside the top of the economic food chain.

antonim@lemmy.dbzer0.com · 15 hours ago

I’m worried that whatever gets sold (Chrome or Android) might end up in the hands of someone even more scummy than Google.

antonim@lemmy.dbzer0.com · 16 days ago

Yeah, totally makes sense, “they” attacked IA one month before the elections, knowing that IA would spend around a month rewriting and improving their site code until the Save Page option would be enabled again (unless IA themselves are a part of the plot???), so that news articles could be “edited on the fly” (with what result?) until election day, while other similar web archiving services such as archive.is would keep working just fine.

antonim@lemmy.dbzer0.com · 21 days ago

And that’s more or less what I was aiming for, so we’re back at square one. What you wrote is in line with my first comment:

it is a weak compliment for AI, and more of a criticism of the current web search engines

The point is that there isn’t something that makes AI inherently superior to ordinary search engines. (Personally I haven’t found AI to be superior at all, but that’s a different topic.) The difference in quality is mainly a consequence of some corporate fuckery to wring out more money from the investors and/or advertisers and/or users at the given moment. AI is good (according to you) just because search engines suck.

antonim@lemmy.dbzer0.com · 22 days ago

AI LLMs simply are better at surfacing it

Ok, but how exactly? Is there some magical emergent property of LLMs that guides them to filter out the garbage from the quality content?

antonim@lemmy.dbzer0.com · 24 days ago

If you don’t feel like discussing this and won’t do anything more than deliberately miss the point, you don’t have to reply to me at all.

antonim@lemmy.dbzer0.com · 24 days ago

they’re a great use in surfacing information that is discussed and available, but might be buried with no SEO behind it to surface it

This is what I’ve seen many people claim. But it is a weak compliment for AI, and more of a criticism of the current web search engines. Why is that information unavailable to search engines, but is available to LLMs? If someone has put in the work to find and feed the quality content to LLMs, why couldn’t that same effort have been invested in Google Search?

antonim@lemmy.dbzer0.com · 24 days ago

Admittedly that sort of censoring has been used online since forever. Stuff like “pr0n”, etc.

antonim@lemmy.dbzer0.com · 1 month ago

Are you a bot? Or just lazy?

I am a bot. Beep boop.

antonim@lemmy.dbzer0.com · 1 month ago

Also, the first woman? Props to her but I’m quite surprised no one else has done that

Yeah, it’s indeed false. I didn’t even research it actively, but Wilson on her Twitter profile mentioned an Italian translator who translated Homer years before Wilson.

(To be sure, I just checked Italian Wikipedia. It was Giovanna Bemporad, her translation was published in 1970.)

antonim@lemmy.dbzer0.com · 1 month ago

Here in my southeast European shithole I’m not worrying about my tax money, the upgrade is going to be pretty cheap, they’re just going to switch from unlicensed XP to unlicensed Win7.

antonim@lemmy.dbzer0.com · 1 month ago

Yep, but I didn’t mention that because it’s not a part of the “Wayback Machine”, it’s just the general “Internet Archive” business of archiving media, which is for now still completely unavailable. (I’ve uploaded dozens of public-domain books there myself, and I’m really missing it…)

antonim@lemmy.dbzer0.com · 1 month ago

You can (well, could) put in any live URL there and IA would take a snapshot of the current page on your request. They also actively crawl the web and take new snapshots on their own. All of that counts as ‘writing’ to the database.

antonim@lemmy.dbzer0.com · 1 month ago

I don’t get the impression you’ve ever made any substantial contributions to Wikipedia, so you have misguided ideas about what would actually be helpful to the editors and conducive to producing better articles. Your proposal about translations is especially telling, because machine-assisted translations (i.e. with built-in tools) existed on WP long before the recent explosion of LLMs.

In short, your proposals either: 1. already exist, 2. would still risk distortion, oversimplification, made-up bullshit and feedback loops, 3. are likely very complex and expensive to build, or 4. are straight up impossible.

Good WP articles are written by people who have actually read some scholarly articles on the subject, including those that aren’t easily available online (so LLMs are massively stunted by default). Having an LLM re-write a “poorly worded” article would at best be like polishing a turd (poorly worded articles are usually written by people who don’t know much about the subject in the first place, so there’s not much material for the LLM to actually improve), and more likely it would introduce a ton of biases of its own (as well as the usual asinine writing style).

Thankfully, as far as I’ve seen the WP community is generally skeptical of AI tools, so I don’t expect such nonsense to have much of an influence on the site.

antonim@lemmy.dbzer0.com · 1 month ago

As far as Wikipedia is concerned, there is pretty much no way to use LLMs correctly, because probably each major model includes Wikipedia in its training dataset, and using WP to improve WP is… not a good idea. It probably doesn’t require an essay to explain why it’s bad to create and mechanise a loop of bias in an encyclopedia.

antonim@lemmy.dbzer0.com · edited 2 months ago

It has custom user-made themes that are dark mode, so it probably has dozens of dark modes.

antonim@lemmy.dbzer0.com · 2 months ago

As I notice this comment is satirical, unlike the (currently) 49 plebeian downvoters, I feel my massive genius brain undulating and pressing upon my skull.

antonim@lemmy.dbzer0.com · 3 months ago

Yeah I’m wondering as well. It seems to save webpages, whereas the issue is with scanned books which may be removed from IA…

antonim@lemmy.dbzer0.com · edited 3 months ago

So child porn is okay then? You would already have it on your system

You’d have to look for it, knowing full well that it is illegal to produce in the first place and distribute to others, access it online, and then deliberately retain it. It’s not really the same as something that’s legal to produce and distribute (it is certainly legal for me to view your site). You wouldn’t “already” have it.

I doubt you are either.

Well I’ve read some copyright laws, had to solve some issues regarding usage of copyrighted works, etc. Nothing that makes me an expert, but I’m not talking wholly out of my ass either.

It does… on paper… A lot. https://time.com/6266147/internet-archive-copyright-infringement-books-lawsuit/ To the point it’s losing lawsuits over exactly that.

That’s not the Wayback Machine per se, that’s Internet Archive’s book scanning and “digital lending” system, which was most definitely doing legally questionable (and stupid) things even to an amateur eye. However, the Wayback Machine making read-only copies of websites has so far never been successfully challenged.

antonim@lemmy.dbzer0.com · 1 year ago

A generational gap on Wikipedia - 91% of WP admins started editing before 2010