Monday, July 08, 2024

Have You Been Scraped? Uncovering AI's Training Data

In the age of generative AI art and large language models, the question of concern for any creative artist (I am loath to call them "content creators", but social media does) is:

Has our work been used to train AI without our knowledge or consent? A new tool offers some answers and a way to take action.


The website haveibeentrained.com allows users to search vast public AI training datasets, such as LAION-5B, using text prompts. Curious about my own digital footprint, I decided to give it a try.

https://haveibeentrained.com/
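For the technically inclined, a similar text search of the LAION-5B index can also be run programmatically with the open-source clip-retrieval client. The sketch below is illustrative only: the endpoint URL and index name are assumptions on my part, and the public service may have changed or gone offline since.

```python
# Rough sketch of searching the LAION-5B index by text, using the open-source
# clip-retrieval client (pip install clip-retrieval). The endpoint and index
# name below are assumptions, and the public service may no longer be running.
from clip_retrieval.clip_client import ClipClient

client = ClipClient(
    url="https://knn.laion.ai/knn-service",  # assumed public LAION knn endpoint
    indice_name="laion5B-L-14",              # assumed name of the LAION-5B index
    num_images=20,
)

# Search by text, e.g. your name or a description of one of your works.
results = client.query(text="charcoal portrait of Sir John Monash")

for r in results:
    # Each result includes a caption, the source image URL and a similarity
    # score, which you can compare against your own published images.
    print(r.get("similarity"), r.get("url"), "-", r.get("caption"))
```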


Searching my name yielded numerous images of other Norm Hansons, but among them was a familiar face: my own. A self-portrait rock painting I'd posted long ago as a profile picture on Artists at Large had made its way into the dataset. I wasn't overly distressed by this single instance, but it did give me pause.

More concerning was the discovery that my charcoal sketch of Sir John Monash, created for an exhibition in 2018, had been scraped from my website. This unauthorized use of my work felt like a violation of my artistic rights.


Fortunately, the website offers a small measure of control. For individual images, users can tick a box that adds the image to a "Do Not Train" register, signaling to participating groups that you don't want your work included in future neural network training sets. For broader protection, entire domains can be registered.

It's worth noting that these actions are somewhat akin to closing the stable door after the horse has bolted. The data has already been used in training existing models. However, it's currently our best option for protecting our work moving forward.

This situation highlights a critical need for transparency and ethical behavior from those creating large language models, whether for legitimate research, commercial interests, or other purposes. As AI continues to evolve, so too must our understanding of its implications for creative rights and data privacy.

Have you checked if your work has been used in AI training datasets? Share your experiences and thoughts in the comments below.

Saturday, July 06, 2024

Is Image Glazing Worth the Hassle?

Protecting our art is becoming crucial in today's AI-driven digital world. But is image glazing the answer? Here's my experience so far.

Cara, currently a popular alternative to Instagram, has yet to offer built-in glazing. They suggest using Glaze, a separate application. Sounds simple. Not quite.

Setting up Glaze is a bit of a headache:

- It's free, 👍 but requires downloading large zip files 👎

- You need a hefty amount of disk space 👎

- The process feels outdated and tedious 👎

(Screenshot: took ~88 minutes to Glaze)
The real kicker? It's slow.👎👎👎 We're talking almost an hour and a half per image for my 9x5 submissions on older hardware. Ouch.

(Screenshot: took ~83 minutes to Glaze)

You can queue up a batch 👍 but images are processed one at a time.👎

But here's the biggest issue: you can't tell if it worked 🤞. The image looks the same, and there's no way to verify if it's actually protecting your style.🤞

So, is it worth it? Excuse me if I'm a little sceptical. The process is time-consuming, and we're essentially trusting a black box. 

Can it really save our unique mark-making styles from AI theft?

Friday, July 05, 2024

Supporting My Art: Why Buy Me a Coffee?


In light of recent developments regarding social media ethics and content usage, I've decided to make some changes to how I share my artwork online.

Why These Changes?

I'm concerned about:
- Unauthorized use of my content
- AI systems scraping my artwork
- Maintaining control over my creative output

What's Changing:

- Reducing posts on social media platforms
- Limiting the publication of large versions of finished works
- Focusing on sharing here through my personal blog and website


How You Can Help

To support my continued work and the maintenance of my independent platforms, I've implemented a "Buy Me a Coffee" system. This allows you to:

- Make one-time contributions
- Show your appreciation without ongoing commitments
- Help offset the costs of hosting and creating content

Your support, even for just one virtual coffee, is greatly appreciated and helps fuel my artistic journey.


Buy Me A Coffee

Thank you for your understanding and continued support!

Tuesday, July 02, 2024

Protecting Creative Work in the Age of AI Scraping

As creatives in the digital age, we're facing a new challenge: how to protect our work from indiscriminate scraping by AI companies. While tools like Creative Commons licensing have been a go-to solution, their effectiveness against AI data collection is questionable.

Creative Commons: A False Sense of Security?

I've long relied on Creative Commons to share my work while maintaining some control. My license specifies attribution, non-commercial use, and (previously) share-alike terms. However, I'm beginning to question whether this offers real protection against AI scraping.

The Reality of AI Data Collection

Many companies, often hiding behind research organizations, are scraping vast amounts of online data to train AI models. This process often ignores licensing terms and lacks proper attribution or curation.


Changing Tactics

In response, I've updated my blog's license from "share-alike" to "no derivatives," hoping to prevent AI from copying my style. However, the legal landscape around this issue remains unclear, especially in Europe.

New Technological Defences

A promising development is the creation of tools that embed subtle changes in image files. These alterations are invisible to humans but can disrupt AI training, potentially "poisoning" the dataset. Glaze and Nightshade are two such tools, though they're still in development and can be resource-intensive to use.
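To give a feel for the general idea only, the sketch below adds a small, bounded pixel-level change to an image file, the kind of alteration a viewer is unlikely to notice at normal sizes. To be clear, this is not the Glaze or Nightshade algorithm and offers no real protection; as I understand it, the real tools optimise their perturbations against the feature extractors AI models rely on, which goes a long way to explaining why they are so slow and resource-hungry. File names here are placeholders.

```python
# Minimal illustration of the concept behind tools like Glaze and Nightshade:
# nudge pixel values by a small, bounded amount so the file changes while the
# image still looks the same to a human viewer. This random noise is NOT a
# real defence; the actual tools compute carefully optimised, style-targeted
# perturbations, which is far more expensive.
import numpy as np
from PIL import Image

def add_small_perturbation(in_path: str, out_path: str, max_shift: int = 3) -> None:
    img = np.asarray(Image.open(in_path).convert("RGB"), dtype=np.int16)
    # Random shift of at most +/- max_shift per channel, well below what most
    # viewers can perceive at normal viewing sizes.
    noise = np.random.randint(-max_shift, max_shift + 1, size=img.shape, dtype=np.int16)
    perturbed = np.clip(img + noise, 0, 255).astype(np.uint8)
    # Save losslessly (PNG) so recompression doesn't wash the changes out.
    Image.fromarray(perturbed).save(out_path)

# Placeholder file names for illustration only.
add_small_perturbation("original.jpg", "perturbed.png")
```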

The Path Forward

Despite these efforts, I'm still uncertain about how to confidently share my work with those who behave ethically while protecting it from misuse. As creatives, we need to stay informed about these issues and continue seeking effective solutions to protect our work in the AI era.

What are your thoughts on protecting creative work in the age of AI? Have you found any effective strategies?


I've prepared this blog post with some good advice, and a little rewording, from Claude.AI.

Wednesday, June 26, 2024

Have you checked your TOS lately?

It would appear that most of the social networks and big internet juggernauts have decided to change their terms of service (TOS). Interestingly, most indicated the changes took effect on the 24th of June! I have no idea why the dates line up. The changes also incorporate broadly similar terms, although written in obscure wording.

To paraphrase, the essence is that services offering to upload, store or process your information now let it be known that they are allowed to use it for their own purposes, often adding a phrase such as "to improve our service". Similar conditions and wording have long appeared in the TOS of such services so that they can handle our data, but these "clarifications" have muddied the water. Many users are starting to worry that this also means the services can use their data in any way they like, such as to train AI, or to modify and republish it as they wish.

Needless to say, this has worried many people, particularly artists using services such as Adobe. Supposedly scores of users have tried to abandon their Adobe subscriptions, only to find it very difficult, for some almost impossible, to leave without incurring significant fees. Similar problems and perceptions are affecting artists on Instagram; I'm not sure of the exact numbers, but there appears to have been a great rush away from Instagram to an alternative, Cara.

I actually do read the terms of service and must admit the changes are a bit hard to follow. However, I do think they leave the way open for the services to claim ownership of anything you post on their particular offerings. In a few places they do still say that you own the copyright, but I assume their lawyers just haven't seen that clause yet.

Most of these services actually rely on us to supply the content that they then package up and use to convince advertisers to pay them money (often very big money). The problem is they "believe" we are happy to have free access to their service and the public exposure the world wide web can offer, without being paid for our content. Yes, these juggernauts have costs, but their profits are larger: massive, even unimaginably, disgustingly exorbitant!

So I decided some time ago not to share anything I intend to sell (like finished artwork or photos) in a readily copyable form anywhere on the internet. I still publish on the likes of Flickr, Blogger, Instagram and my own website: things I'm working on, progress updates and just stuff I find interesting. This works for me because I'm not relying on the internet to sell my works.


Where does that leave us creatives and what can we do about it? 





I’m probably gonna leave social media to the bots for now and see what happens. Maybe play around a little in Cara.

Monday, June 24, 2024

The Wild West Side of AI

It seems as if the world wide web is adopting the ways of the Wild West: no laws, so if you see something worthwhile you take it. You don't even need to do it at gunpoint these days; you can just silently scrape it and make a copy, which is ever so easy for digital information. This despite the fact that there are actually laws in place that should stop people doing this. The problem is that copyright is complex and varies between jurisdictions, whereas the web goes everywhere. I think the fact that a lot of these services consider themselves platforms and not publishers is a very weak cop-out; even if it may be marginally legal, it's probably not moral.

I cannot believe, for instance, that X, and specifically Elon Musk, supposedly champions free speech by letting very dubious characters peddle hate speech and straight-out lies, like spreading politically motivated fake news and fanciful, long-discredited conspiracy theories, while at the same time fighting a government trying to take down footage of a teenage terrorist stabbing a priest in the face, which the teen streamed live online. I believe the Australian government's request to have it taken down was quite morally legitimate. Have they no shame? I guess not.

So don't expect the big guys on the internet, or many others without a moral compass, to respect your work or your loyal support. They will take what they can, then throw you under the bus. However, I like the idea of sharing what I know and what I have created; I just don't want it reused without reference to me, or straight-out stolen.

PS: Can you see the sunglass-wearing laughing face? Is it an example of the intelligence of generative AI, or just another example of it hallucinating (aka getting it wrong)? Or is it our own intelligence at work, recognising patterns and shapes (e.g. faces in clouds or bandanas)?

Tuesday, June 18, 2024

Enjoying the Warmth

When making a stitched panorama into the sun, it is important to keep the exposure constant to avoid dramatic colour changes across a clear blue sky. I've left in the lens flare artefacts; they add to the "sparkle".