  • 1 Post
  • 753 Comments
Joined 3 years ago
Cake day: August 21st, 2023


  • I think you are underestimating how accurate LLMs are because you probably don’t use them much, and only see their mistakes posted for memes. No one’s going to post the 99 times an LLM gives the correct answer, but the one time it says to put glue on pizza it’s going to go viral. So if your only view of LLM output is from posts, you’re going to think it’s way worse than it is.

    And look at what is on my feed just this morning: https://lemmy.world/post/44099386

    It’s not just that LLMs are shit. It’s that people trust them way too much and are shocked when the predictable happens.

    Even if you mark it down for incorrect answers, it’s still going to beat most people. An LLM can score in the 90th percentile on the SAT, and around the 80th percentile on the LSAT.

    And of course the AI bro goes for the “vibes” argument. You can’t just state that as true without providing a source. Or did AI tell you it was true?

    For example: fewer than 10% of the AIs tested consistently answered correctly that you need to drive to a car wash in order to wash your car: https://opper.ai/blog/car-wash-test

    That’s a question far below anything on the SAT or LSAT, and over 90% of LLMs can’t even get it right consistently.

    If you’re doubting my percentages on the accuracy of LLMs, I’d encourage you to test them yourself.

    I’ve tried using LLMs. I don’t use them for research, because why the fuck would I? Better, more efficient tools already exist for that. When I did have something that a search engine couldn’t help me with and that LLMs are apparently “good at”, the LLM immediately proved itself to be worthless.



  • An LLM will give more accurate declarative statements on more questions than any human can

    Not if you include “I don’t know” as an accurate statement or penalize the score for incorrect declarative statements.

    So is it not more trustworthy for giving declarative statements than any random human? Would you not trust an LLM’s answer on who the 4th president is over a random human’s?

    I would absolutely trust the random human more, because they’re not going to make shit up if they don’t know. Their answer will either be “I don’t know” or “I would guess” to make it clear they aren’t confident. The LLM will give me a declarative answer, but I have no fucking clue whether it’s accurate or a “hallucination” (lie). I’ll need to do what I should have done in the first place and ask a search engine to make sure.
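
    To make that concrete, here’s a toy scoring example with completely made-up numbers (nothing from this thread, just an illustration of what “penalize incorrect declarative statements” means):

```python
# Hypothetical scoring rule: +1 for a correct answer, 0 for "I don't know",
# and a penalty for a confidently wrong declarative answer.
def score(results, wrong_penalty=-1):
    points = {"correct": 1, "idk": 0, "wrong": wrong_penalty}
    return sum(points[r] for r in results)

# Made-up ten-question quiz.
# Cautious human: 6 correct, admits "I don't know" on the other 4.
human = ["correct"] * 6 + ["idk"] * 4

# LLM: answers all 10, gets 8 right and 2 confidently wrong (80% raw accuracy).
llm = ["correct"] * 8 + ["wrong"] * 2

print(score(human))                  # 6
print(score(llm))                    # 6 -> the "more accurate" LLM only ties
print(score(llm, wrong_penalty=-2))  # 4 -> and loses if confident nonsense costs more
```

    The moment wrong answers cost something, never saying “I don’t know” stops being an advantage.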


  • they have good declarative knowledge

    No. They don’t. They are good at making declarative statements.
    That’s not the same thing.

    Every day you also probably see a new post of humans being blatantly wrong; does that mean humans can’t know things?

    I fully agree that asking a random human for help with something is just as effective as asking an LLM.

    If I need to know something (like who was the first president of the United States), I will not go outside and ask a random human; I will ask a trustworthy source.
    If I need some code written, I won’t have a random human do it; I will interview people to find someone capable.
    If I need someone to interact with customers, I won’t let some random human come in and do it.




  • Whether an LLM can determine truth depends on your definition of truth

    Of course someone who doesn’t believe “truth” exists thinks LLMs are just fine. You have to not believe things can be true in order to find their output acceptable.

    An LLM can derive this sort of truth by determining the consensus of its training data, assuming its training data is from trustworthy sources or the more trustworthy sources are more reinforced.

    Every week I see a new post of an LLM being blatantly wrong. An LLM said to add glue to pizza to make the cheese stick together.

    “They have improved the models since then…” Last week the American military used “AI” and it targeted a school as a military structure. The models are full of shit; they just manually remove the blatantly incorrect shit whenever it makes the rounds, and there’s always more blatantly incorrect shit to be found.