  • bitsplease@lemmy.ml · 28 points · 9 months ago

    One thing I’d be interested in is getting a self-assessment from each person of how good they believe they were at picking out the fakes.

    I already see online comments constantly claiming that they can “totally tell” when an image is AI or a comment was written by ChatGPT, but I suspect confirmation bias plays a bigger part than most people realize in how much they trust a source (the classic “if I agree with it, it’s true; if I don’t, it’s a bot/shill/idiot”).

    • ILikeBoobies@lemmy.ca · 4 points · 9 months ago

      With the majority being in CS fields and having used AI image generation before, they would likely be better than the average person at picking out fakes.

      • bitsplease@lemmy.ml · 7 points · 9 months ago

        You’d think so, but according to OP they did basically the same (slightly worse, actually), which is interesting.

        • ILikeBoobies@lemmy.ca · 1 point · 9 months ago

          The ones using image generation did slightly better

          I was mostly commenting to point out that there’s no need to find that person who can “totally tell”, because they can’t.

      • lloram239@feddit.de · 1 point · 9 months ago

        Even when you know what you’re looking for, you’re basically pixel-hunting for artifacts or other signs that an image is AI without the image actually looking fake. For example, the avocado one was easy to tell because avocado-themed images have been used as test cases ever since DALL-E 1; the https://thispersondoesnotexist.com/ one was obvious from how it was framed; and some of the landscapes had that noise-vegetation look AI images tend to have. But none of the images look fake by themselves. If you didn’t specifically look for AI artifacts, it would be impossible to tell the difference, or even notice that anything is wrong with the image to begin with.

    • Spzi@lemm.ee · 1 point · 9 months ago

      Right? A self-assessed skill that is never tested is a funny thing anyway. It boils down to “I believe I’m good at it because I believe my belief is correct.” That’s shaky in itself, but there are also incentives to believe you’re good, and the people who don’t believe it probably don’t speak up as much. Personally, I think people lack the competence to make statements like these with any real meaning.

  • yokonzo@lemmy.world · 27 points (1 down) · edited · 9 months ago

    One thing I’m not sure about is whether it skews anything, but technically AI images are curated more than anything else: you take a few prompts, throw them into a black box, get a couple of results back, refine, throw them back in, and repeat. So I don’t know if it’s fair to say people are getting fooled by AI-generated images rather than AI-curated ones, which I feel is an important distinction; these images were chosen because they look realistic.

    • popcar2@programming.dev (OP) · 4 points · 9 months ago

      Technically you’re right, but the thing about AI image generators is that they make it really easy to mass-produce results. Each one I used in the survey took me only a few minutes, if that. Some images, like the cat ones, came out great on the first try. If someone wants to curate AI images, it takes very little effort.

  • erwan@lemmy.ml · 26 points · 9 months ago

    So if the average is roughly 10/20, which is about the same as responding randomly each time, does that mean humans are completely unable to distinguish AI images?
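
    For what it’s worth, a group average near 10/20 is exactly what pure guessing predicts, though an average alone can’t rule out that some individuals do better than chance. Here is a minimal sketch of the chance baseline, assuming each of the 20 answers is an independent 50/50 guess (the cutoff of 14 below is just an illustrative example):

```python
# Chance baseline for a 20-question real-vs-AI quiz: score ~ Binomial(n=20, p=0.5).
from math import comb, sqrt

n, p = 20, 0.5
mean = n * p                # expected score: 10.0
sd = sqrt(n * p * (1 - p))  # spread: ~2.24

def pmf(k: int) -> float:
    """Probability of exactly k correct answers under pure guessing."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_at_least_14 = sum(pmf(k) for k in range(14, n + 1))
print(f"mean={mean}, sd={sd:.2f}")                            # mean=10.0, sd=2.24
print(f"P(score >= 14 by luck alone) = {p_at_least_14:.3f}")  # ~0.058
```

    By the same arithmetic, a 17/20 (as one commenter below reports) has only about a 0.1% chance of happening by blind guessing.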

    • SkyNTP@lemmy.ml · 14 points · 9 months ago

      In theory, yes. In practice, not necessarily.

      I found that the images were not very representative of the typical AI art styles I’ve seen in the wild. So not only would that render pre-existing learned cues useless, it could actually turn them into obstacles to guessing correctly, pushing the score below random guessing (especially if the images in this test were not chosen randomly, but were actively chosen not to look like typical AI images).

      • Rolder@reddthat.com · 5 points · 9 months ago

        I would also think it depends on what kinds of art you are familiar with. If you don’t know what normal pencil art looks like, how are you supposed to recognize the AI version?

        As an example, when I’m browsing certain, ah, nsfw art, I can recognize the AI ones no issue.

    • Hamartiogonic@sopuli.xyz · 4 points · 9 months ago

      If you look at the ratios for each picture, you’ll notice that there are roughly two categories: hard and easy pictures. Based on information like this, OP could fine-tune a more comprehensive questionnaire to include some photos that are clearly in between. I think it would be interesting to use this data to figure out what makes a picture easy or hard to identify correctly.

      My guess is that a picture is easy if it has fingers or logical structures such as text, railways, or buildings, while illustrations and drawings could be harder to identify correctly. Some natural structures such as coral, leaves, and rocks could also be difficult; when an AI makes mistakes in those areas, humans won’t notice them very easily.

      The number of easy and hard pictures was roughly equal, which brings the mean and median values close to 10/20. If you want to bring that value up or down, just change the number of hard-to-identify pictures.
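
      To illustrate that last point, here is a rough sketch of how the easy/hard mix drives the mean score; the per-category accuracies are made-up placeholders, not numbers from the survey:

```python
# Rough model: expected score = (# easy pictures) * p_easy + (# hard pictures) * p_hard.
# The accuracy values below are illustrative placeholders, not survey data.
def expected_score(n_easy: int, n_hard: int,
                   p_easy: float = 0.70, p_hard: float = 0.35) -> float:
    """Expected number of correct answers out of n_easy + n_hard pictures."""
    return n_easy * p_easy + n_hard * p_hard

print(expected_score(10, 10))  # 10.5  -> an even mix lands near the observed ~10/20
print(expected_score(15, 5))   # 12.25 -> more easy pictures pushes the mean up
print(expected_score(5, 15))   # 8.75  -> more hard pictures pushes it down
```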

    • KairuByte@lemmy.dbzer0.com · 2 points · 9 months ago

      It depends on whether these were hand-picked as the most convincing. If they were, this can’t be used as a representative sample.

    • porkchop@lemm.ee · 1 point · 9 months ago

      Personally, I’m not surprised. I thought a 3D dancing baby was real.

    • doggle@lemmy.dbzer0.com · 1 point · 9 months ago

      From this particular set, at least. Notice how people were better at guessing some particular images.

      Stylized and painterly images seem particularly hard to differentiate.

  • doggle@lemmy.dbzer0.com · 22 points · 9 months ago

    Having used Stable Diffusion quite a bit, I suspect the data set here uses only the most difficult-to-distinguish photos. Most results are nowhere near as convincing as these; notice the lack of hands. Still, this establishes that AI is capable of creating art that most people can’t tell apart from human-made art, albeit with some trial and error and a lot of duds.

    • bitsplease@lemmy.ml · 13 points · 9 months ago

      Idk if I’d agree that cherry-picking images has any negative impact on the validity of the results: when people create an AI-generated image, particularly if they intend to deceive, they’ll keep generating images until they get one that’s convincing.

      At least when I use SD, I generally generate 3-5 images for each prompt, often regenerating several times with small tweaks to the prompt until I get something I’m satisfied with.

      Whether or not humans can recognize the worst efforts of these AI image generators is more or less irrelevant, because only the laziest deceivers will use the really obviously wonky images rather than cherry-picking.
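
      To give a sense of how cheap that generate-and-pick loop is, here is a minimal sketch using the Hugging Face diffusers library; the checkpoint name, prompt, and settings are placeholders of mine, not anything the commenters or OP actually used:

```python
# Generate a small batch of candidates for one prompt, then hand-pick the best.
# Checkpoint, prompt, and parameters below are illustrative placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed model checkpoint
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a cat curled up in a waffle cone, photorealistic"  # hypothetical prompt

# One call produces several candidates; a human curator keeps the convincing one.
images = pipe(prompt, num_images_per_prompt=4, guidance_scale=7.5).images
for i, img in enumerate(images):
    img.save(f"candidate_{i}.png")
```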

      • lloram239@feddit.de · 1 point · edited · 9 months ago

        AI is only good at a subset of all possible images. If you have images with multiple people, real-world products, text, hands interacting with stuff, unusual posing, etc., it becomes far more likely that artifacts slip in, oftentimes huge ones that are very easy to spot. For example, even DALL-E 3 can’t generate a realistic-looking N64. It will generate something that looks very N64-ish and gets the overall shape right, but is wrong in all the little details: the logo is distorted, the ports have the wrong shape, and so on.

        If you spend a lot of time inpainting and manually adjusting things, you can get rid of some of the artifacts, but at that point you aren’t really AI-generating images anymore, just using AI as source material for photoshopping. If you’re only using AI and picking the best images, you’ll end up with a collection that all looks very AI-ish, since the images will all feature very similar framing, posing, and layout. Even if no individual image looks suspicious by itself, when you have a large number of them they end up looking very similar, since they lack the diversity that human-made images have, as well as their temporal consistency.

    • blueberrypie@lemmy.world · 11 points · 9 months ago

      These images were fun, but we can’t draw any conclusions from them. They were clearly chosen to be hard to distinguish. It’s like picking 20 images of androgynous-looking people and then asking everyone to identify them as women or men. The fact that the success rate will be near 50% says nothing about the general skill of identifying gender.

  • Funderpants@lemmy.ca · 19 points · edited · 9 months ago

    Wow, what a result. Slight right skew but almost normally distributed around the exact expected value for pure guessing.

    Assuming there were 10 examples in each class anyway.

    It would be really cool to follow up by giving some sort of training on how to tell, if such training even exists, and then retest to see whether people get better.
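
    For comparison, here is a small sketch of what the pure-guessing score distribution looks like (Binomial with n = 20, p = 0.5) next to its normal approximation; it is symmetric around 10, so any skew in the observed histogram isn’t coming from the guessing model itself:

```python
# Score distribution under pure guessing vs. its normal approximation.
from math import comb, exp, pi, sqrt

n, p = 20, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

def binom_pmf(k: int) -> float:
    """P(exactly k correct) when every answer is a 50/50 guess."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x: float) -> float:
    """Normal approximation with the same mean and standard deviation."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

for k in range(n + 1):
    bar = "#" * round(200 * binom_pmf(k))  # crude text histogram
    print(f"{k:2d}  binom={binom_pmf(k):.3f}  normal≈{normal_pdf(k):.3f}  {bar}")
```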

    • WalrusDragonOnABike@kbin.social · 4 points · 9 months ago

      Imo, 3, 17, and 18 were obviously AI (based on what I’ve seen from AI art generators in the past*). But whatever original art those are based on, I’d probably also flag as obviously AI. The rest I was basically guessing at random, especially the sketches.

      *I’ve never used AI generators myself, but I’ve seen others do it on stream. Curious how many others like me are raising the average for the “people that haven’t used AI image generators before” group.

    • gullible@kbin.social · 2 points · 9 months ago

      I was legitimately surprised by the man on a bench being human-made. His ankle is so thin! The woman in a bar/restaurant also surprised me because of her tiny finger.

  • squirrelwithnut@lemmy.world · 12 points · 9 months ago

    Sketches are especially hard to tell apart because even humans put in extra lines and add embellishments here and there. I’m not surprised more than 70% of participants weren’t able to tell that one was generated.

  • AVincentInSpace@pawb.social · 8 points · edited · 9 months ago

    Something I’d be interested in is restricting the “Are you in computer science?” question to AI-related fields, rather than the whole of CS, which is about as broad a field as social science. Neural networks are a tiny sliver of a tiny sliver.

    • doctorcrimson@lemmy.today · 2 points · 9 months ago

      Especially since, depending on the nation or district a person lives in, “CS” can cover an even broader range, everything from IT support to engineering.

  • crawley@lemmy.world · 5 points · 9 months ago

    It’d also be interesting to see the two non-AI images that the most people thought were AI.

  • rbn@feddit.ch · 5 points · 9 months ago

    Thank you so much for sharing the results. Very interesting to see the outcome after participating in the survey.

    Out of interest: do you know how many participants came from Lemmy compared to other platforms?

  • lenz@lemmy.ml · 4 points · 9 months ago

    I got a 17/20, which is awesome!

    I’m angry because I could’ve gotten an 18/20 if I’d paid attention to the glasses in the thispersondoesnotexist one, which, in hindsight, are clearly all messed up.

    I did guess that one human-created image, “The End of the Journey”, was made by AI. I guessed that way because the horses had indistinct legs and no tails, and the back door of the cart they were pulling looked funky. The sky looked weirdly detailed near the top of the image and suddenly less detailed near the middle, and there were birds at the very corner of the image, which was strange. I did notice the cart has a step-up stool attached to the door, which is something an AI likely wouldn’t include, but I wasn’t sure about that. In the end, I chose wrong.

    It seems the best strategy really is to look at the image and ask two questions:

    • what intricate details of this image are weird or strange?
    • does this image have ideas that indicate thought was put into it?

    About the second bullet point: it was immediately clear to me that the strawberry cat thing was human-made, because the waffle cone it was sitting in was shaped like a fish. That’s not really something an AI would understand to be clever.

    On the tomato and avocado one, the avocado was missing an eyebrow, and one of the leaves on the tomato’s stem didn’t connect correctly to the rest. Plus, their shadows were identical and didn’t match the shadows they would’ve cast had a human drawn them: if a human had done the shadows, they would either be two perfect simplified circles or include the avocado’s arm. The AI included the feet but not the arm, which was odd.

    The anime sword guy’s armor suddenly diverged in style between the left and right sides of the sword. It’s especially apparent in his skirt and the shoulder pads.

    The sketch of the girl sitting on the bench also had a mistake: one of the back legs of the bench didn’t make sense. Her shoes were also very indistinct.

    I’ve not had a lot of practice staring at AI images, so this result is cool!

    • tigeruppercut@lemmy.zip · 1 point · 9 months ago

      “does this image have ideas that indicate thought was put into it?”

      I got fooled by the bright mountain one. I assumed it was just generic art vomit à la Kinkade.

    • Syrc@lemmy.world · 1 point · 9 months ago

      “About the second bullet point: it was immediately clear to me that the strawberry cat thing was human-made, because the waffle cone it was sitting in was shaped like a fish. That’s not really something an AI would understand to be clever.”

      It’s a [taiyaki cone](https://japancrate.com/cdn/shop/articles/275b4c92-2649-4e94-896f-4bc3ce4acbad_1200x1200.png?v=1678715822), something that already exists. It probably wouldn’t be too hard to get an AI to replicate it.

      I personally thought the stuff hanging on the side was oddly placed and got fooled by it.