Who needs humans when you have AI :p

  • Dr. Jenkem@lemmy.blugatch.tube
    1 year ago

    I think you misunderstand the problem. Sure, it starts with small amounts of output fed into the input, but as it continues to generate large amounts of output over time, more and more of the output makes it into the input.

    And again, limiting LLMs to pre-2023 training data ensures they never get smarter. Human knowledge keeps expanding, while LLMs are at best locked into a permanent state of 2023 knowledge.
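To put rough numbers on the feedback loop described above, here is a toy sketch. Every figure and parameter name in it is a made-up assumption (starting corpus size, per-generation human and AI output, growth rate, and the `kept_fraction` knob standing in for curation); it is only meant to show how the synthetic share of a corpus could move, not how any real training pipeline behaves.

```python
def synthetic_share(generations, human_start=1000.0, human_per_gen=100.0,
                    synthetic_per_gen=50.0, synthetic_growth=1.5,
                    kept_fraction=1.0):
    """Return the synthetic fraction of a toy corpus after each generation.

    All quantities are hypothetical "document counts":
      human_start       - human-written documents at the beginning
      human_per_gen     - human documents added every generation
      synthetic_per_gen - AI documents produced in the first generation
      synthetic_growth  - factor by which AI output grows each generation
      kept_fraction     - share of AI output that ends up in the corpus
                          (1.0 = blind harvesting, lower = filtering)
    """
    human = human_start
    synthetic = 0.0
    output = synthetic_per_gen
    shares = []
    for _ in range(generations):
        human += human_per_gen
        synthetic += output * kept_fraction
        output *= synthetic_growth  # AI output keeps growing over time
        shares.append(synthetic / (human + synthetic))
    return shares


if __name__ == "__main__":
    blind = synthetic_share(10)
    filtered = synthetic_share(10, kept_fraction=0.1)
    for gen, (b, f) in enumerate(zip(blind, filtered), start=1):
        print(f"generation {gen}: blind={b:.1%} filtered={f:.1%}")
```

With blind harvesting the synthetic share climbs every generation in this toy model; lowering `kept_fraction` slows that climb, which is essentially what the two sides of this thread are arguing about.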

    • FaceDeer@kbin.social
      1 year ago

      Sure, it starts with small amounts of output fed into the input, but as it continues to generate large amounts of output over time, more and more of the output makes it into the input.

      Not inevitably. You’re assuming that each “generation” of AI is being trained on a data set that’s just blindly harvested. AI trainers are already spending a huge amount of effort curating their training sets; it’s become quite apparent that the quality of the training set is important and that you can’t just dump a giant raw pile of everything into it to get good results. Filtering out AI-generated output would just be another thing for them to consider.

      • Dr. Jenkem@lemmy.blugatch.tube
        1 year ago

        To a certain extent, yes, the training data is being blindly dumped in. There’s no way terabytes of training data are being manually reviewed for accuracy. If for no other reason, it doesn’t make economic sense to do so. It’s simply not feasible for humans to manually curate all of that data, and even if they did, human error still exists.

        • FaceDeer@kbin.social
          1 year ago

          Your disbelief doesn’t mean it’s not happening. The data sources that go into AIs are indeed curated selectively. Honestly, what do you think happens? That a web crawler is told to just “go nuts” and whatever random data it spits out gets fed right in? Trainers pick their sources carefully. They deduplicate the data, they format it, they do a lot of work on it.

          Perfection is not required. Human error is fine in manageable amounts.
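For anyone wondering what "pick sources, deduplicate, format" might look like in practice, here is a minimal sketch. It is not any real lab's pipeline: the `TRUSTED_SOURCES` allowlist, the `(source, text)` record shape, and the `curate` function are all hypothetical, and the deduplication is only exact-match after normalization (real systems reportedly go well beyond that).

```python
import hashlib

# Hypothetical allowlist: a stand-in for "trainers pick their sources carefully".
TRUSTED_SOURCES = {"encyclopedia", "textbooks"}


def curate(records):
    """Filter, format, and deduplicate (source, text) pairs.

    Steps, in order:
      1. drop records whose source is not on the allowlist
      2. collapse runs of whitespace (a very basic formatting pass)
      3. drop records that are empty after cleaning
      4. drop exact duplicates, compared case-insensitively via a hash
    """
    seen = set()
    kept = []
    for source, text in records:
        if source not in TRUSTED_SOURCES:
            continue
        cleaned = " ".join(text.split())
        if not cleaned:
            continue
        digest = hashlib.sha256(cleaned.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append((source, cleaned))
    return kept


if __name__ == "__main__":
    sample = [
        ("encyclopedia", "Water  boils at 100 C."),
        ("encyclopedia", "water boils at 100 c."),  # duplicate after normalizing
        ("random_forum", "lol"),                    # source not on the allowlist
        ("textbooks", "   "),                       # empty after cleaning
    ]
    print(curate(sample))
```

None of this requires a human to read every document, and it does not need to be perfect: the point is that automated passes like these can remove a lot of junk cheaply.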