• keepthepace@slrpnk.net
    link
    fedilink
    arrow-up
    1
    ·
    10 months ago

    My problem is that reconstructions by architects are produced differently, using more information, than Hollywood-like DALL-E outputs. The first one is made to be informative, about the size and aspect of the buildings, the urban organization, etc. The second one is just designed to look good.

    • lanolinoil@lemmy.worldOP
      link
      fedilink
      English
      arrow-up
      1
      ·
      10 months ago

      Not entirely – This uses controlnet to mimic the lines and shapes in the original image – It’s finicky getting the settings right so it works well in all cases, but this is more than just a straight img2img conversion or DALLE txt2img generation.

      This is a good example:

      original wiki image

      SDXL Controlnet image

      • keepthepace@slrpnk.net
        link
        fedilink
        arrow-up
        1
        ·
        10 months ago

        This is indeed a good example. You literally just give it squares to fill with whatever the model feels makes sense and as a result you have christian looking fresco probably very far from the likely reconstructions archaeologists would produce (the red parts were likely totally red, without characters in them).

        I feel that applied like that it is closer to misinformation than enhancement.

        • lanolinoil@lemmy.worldOP
          link
          fedilink
          English
          arrow-up
          1
          ·
          10 months ago

          Well, not exactly again – The original image is passed to GPT4 vision along with the article title and the image’s description, so it’s attempting to be contextual in its prompts and contextual to the shapes in the original image.

            • lanolinoil@lemmy.worldOP
              link
              fedilink
              English
              arrow-up
              1
              ·
              10 months ago

              I am pleased with it and its potential, especially for a v1. It’s OK that you don’t feel the same way.

              • keepthepace@slrpnk.net
                link
                fedilink
                arrow-up
                2
                ·
                10 months ago

                Ah sorry, I did not realize you were the author, I would have been less brutal. Then let me rephrase my criticism: I think it is unproductive to show reconstruction of statues or painting of archaeological interest. Maybe showcase it more on enhancement of black and ,white pictures or sci fi illustration ?

                • lanolinoil@lemmy.worldOP
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  10 months ago

                  all good thanks for the feedback! This will probably end up being the foundation for my weeb targeted saas app cash grab :P

                  Really fun tech project and supabase.com I’ve fallen in love with I think – best dev experience ever maybe