I keep reading posts and articles about pressing "pause" on AI development. First, a moratorium rarely works: there is no way to verify that nobody is building the next-gen LLM. Second, rather than focusing exclusively on the dangers, we should identify the features we would like this technology to offer. That would give academia, industry, and regulators something concrete to work toward. So here is a wish list I am compiling. I am not saying everything below is doable, but I would claim everything is desirable.

- Generative AI should be marked as such (“AI made” vs. “human made”). Fake images should carry a fingerprint that says “this is fake”, and GPT-like content should be flagged as “AI generated” (see the metadata sketch after this list).
- Generative AI should be green and not consume energy foolishly (think Bitcoin mining farms). That means reusing LLMs and datasets, and finding better ways to train and run inference.
- Generative AI should be explainable, which in the case of LLMs means pointing to the references where the information was learned.
- Generative AI should respect copyrighted materials and provide its nutrition facts (which datasets were used); the BloombergGPT paper does a good job here (see the datasheet sketch below). Maybe robots.txt should have a specific directive to ban LLM scrapers (also sketched below).
- Generative AI should provide a way to fix errors. Search engines had the “right to be forgotten”; we need something like that.
- There should be benchmarks to evaluate generative AI in terms of correctness, bias, etc. (a minimal harness is sketched below).
- There should be best practices for producing LLM-friendly content (think SEO, but for LLMs, aka LLM-O), including documentation, APIs, etc.

Please add your own.
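On marking content as “AI made”: one concrete form this could take is machine-readable provenance metadata attached to generated images and text. A minimal sketch, assuming a hypothetical schema (none of these field names are an existing standard):

```json
{
  "provenance": "ai-generated",
  "generator": "example-llm-v1",
  "generated_at": "2023-04-12T09:30:00Z",
  "content_sha256": "hash-of-the-image-or-text-bytes"
}
```

Binding the claim to a hash of the exact bytes, and signing the record, would at least make tampering detectable; a visible watermark would be the human-readable complement.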
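On nutrition facts: in the spirit of the BloombergGPT paper, which documents its training corpus in detail, every model could ship a machine-readable datasheet. A hypothetical sketch, where the dataset names and token counts are placeholders, not real figures:

```yaml
model: example-llm-v1              # placeholder model name
training_data:
  - dataset: public-web-crawl     # illustrative entries only
    tokens: 300B
    license: mixed
  - dataset: licensed-news-archive
    tokens: 50B
    license: proprietary
```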
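On robots.txt: the `User-agent`/`Disallow` syntax already exists, so one option is a well-known user-agent name that training crawlers would have to honor; another is a new directive scoped to training use rather than crawling. Both are sketched below; the names are hypothetical, since no such standard exists:

```
# Option 1: disallow a (hypothetical) well-known LLM training crawler
User-agent: LLM-Scraper
Disallow: /

# Option 2: a (hypothetical) directive saying "crawl me, but do not train on me"
LLM-Training: disallow
```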
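On benchmarks: here is a minimal sketch of what a correctness harness could look like, assuming the model under test is exposed as a `generate(prompt)` function. A real benchmark would need thousands of cases, separate suites for correctness, bias, toxicity, etc., and far better scoring than substring matching:

```python
from typing import Callable

# Tiny illustrative eval set; real benchmarks would have thousands of cases.
EVAL_CASES = [
    {"prompt": "What is the capital of France?", "expected": "paris"},
    {"prompt": "Is 17 a prime number?", "expected": "yes"},
]

def evaluate(generate: Callable[[str], str]) -> float:
    """Return the fraction of cases where the expected answer appears
    in the model's output (exact-substring match, the crudest scorer)."""
    hits = 0
    for case in EVAL_CASES:
        answer = generate(case["prompt"]).lower()
        if case["expected"] in answer:
            hits += 1
    return hits / len(EVAL_CASES)

if __name__ == "__main__":
    # Stand-in "model" so the harness runs end to end; swap in a real LLM call.
    def dummy_model(prompt: str) -> str:
        return "Paris is the capital. Yes, 17 is prime."

    print(f"accuracy: {evaluate(dummy_model):.2f}")
```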