Monday, July 08, 2024

Have You Been Scraped? Uncovering AI's Training Data

In the age of generative AI art and large language models, the question of concern for any creative artist (I am loath to call them "content creators", but social media does) is:

Has our work been used to train AI without our knowledge or consent? A new tool offers some answers and a way to take action.


The website haveibeentrained.com allows users to search vast public AI training datasets, such as LAION-5B, using text prompts. Curious about my own digital footprint, I decided to give it a try.

https://haveibeentrained.com/
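For the technically inclined, a similar text search of the LAION-5B index can also be run programmatically with the open-source clip-retrieval client. The sketch below is illustrative only: the endpoint URL and index name are assumptions on my part, and the public service may have changed or gone offline since.

```python
# Rough sketch of searching the LAION-5B index by text, using the open-source
# clip-retrieval client (pip install clip-retrieval). The endpoint and index
# name below are assumptions, and the public service may no longer be running.
from clip_retrieval.clip_client import ClipClient

client = ClipClient(
    url="https://knn.laion.ai/knn-service",  # assumed public LAION knn endpoint
    indice_name="laion5B-L-14",              # assumed name of the LAION-5B index
    num_images=20,
)

# Search by text, e.g. your name or a description of one of your works.
results = client.query(text="charcoal portrait of Sir John Monash")

for r in results:
    # Each result includes a caption, the source image URL and a similarity
    # score, which you can compare against your own published images.
    print(r.get("similarity"), r.get("url"), "-", r.get("caption"))
```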


Searching my name yielded numerous images of other Norm Hansons, but among them was a familiar face: my own. A self-portrait rock painting I'd posted long ago as a profile picture on Artists at Large had made its way into the dataset. I wasn't overly distressed by this single instance, but it did give me pause.

More concerning was the discovery that my charcoal sketch of Sir John Monash, created for an exhibition in 2018, had been scraped from my website. This unauthorized use of my work felt like a violation of my artistic rights.


Fortunately, the website offers a small measure of control. For individual images, users can tick a box that adds the image to a "Do Not Train" register, signaling to participating groups that you don't want your work included in future neural network training sets. For broader protection, entire domains can be registered.

It's worth noting that these actions are somewhat akin to closing the stable door after the horse has bolted. The data has already been used in training existing models. However, it's currently our best option for protecting our work moving forward.

This situation highlights a critical need for transparency and ethical behavior from those creating large language models, whether for legitimate research, commercial interests, or other purposes. As AI continues to evolve, so too must our understanding of its implications for creative rights and data privacy.

Have you checked if your work has been used in AI training datasets? Share your experiences and thoughts in the comments below.

Saturday, July 06, 2024

Is Image Glazing Worth the Hassle?

Protecting our art is becoming crucial in today's AI-driven digital world. But is image glazing the answer? Here's my experience so far.

Cara, currently a popular alternative to Instagram, has yet to offer built-in glazing. They suggest using Glaze, a separate application. Sounds simple. Not quite.

Setting up Glaze is a bit of a headache:

- It's free, 👍 but requires downloading large zip files 👎

- You need a hefty amount of disk space 👎

- The process feels outdated and tedious 👎

(Screenshot: took ~88 minutes to Glaze)
The real kicker? It's slow.👎👎👎 We're talking almost an hour and a half per image for my 9x5 submissions on older hardware. Ouch.

(Screenshot: took ~83 minutes to Glaze)

You can queue up a batch 👍 but images are processed one at a time.👎

But here's the biggest issue: you can't tell if it worked 🤞. The image looks the same, and there's no way to verify if it's actually protecting your style.🤞

So, is it worth it? Excuse me if I'm a little sceptical. The process is time-consuming, and we're essentially trusting a black box. 

Can it really save our unique mark-making styles from AI theft?

Friday, July 05, 2024

Supporting My Art: Why Buy Me a Coffee?


In light of recent developments regarding social media ethics and content usage, I've decided to make some changes to how I share my artwork online.

Why These Changes?

I'm concerned about:
- Unauthorized use of my content
- AI systems scraping my artwork
- Maintaining control over my creative output

What's Changing:

- Reducing posts on social media platforms
- Limiting the publication of large versions of finished works
- Focusing on sharing here through my personal blog and website


How You Can Help

To support my continued work and the maintenance of my independent platforms, I've implemented a "Buy Me a Coffee" system. This allows you to:

- Make one-time contributions
- Show your appreciation without ongoing commitments
- Help offset the costs of hosting and creating content

Your support, even for just one virtual coffee, is greatly appreciated and helps fuel my artistic journey.


Buy Me A Coffee

Thank you for your understanding and continued support!

Tuesday, July 02, 2024

Protecting Creative Work in the Age of AI Scraping

As creatives in the digital age, we're facing a new challenge: how to protect our work from indiscriminate scraping by AI companies. While tools like Creative Commons licensing have been a go-to solution, their effectiveness against AI data collection is questionable.

Creative Commons: A False Sense of Security?

I've long relied on Creative Commons to share my work while maintaining some control. My license specifies attribution, non-commercial use, and (previously) share-alike terms. However, I'm beginning to question whether this offers real protection against AI scraping.

The Reality of AI Data Collection

Many companies, often hiding behind research organizations, are scraping vast amounts of online data to train AI models. This process often ignores licensing terms and lacks proper attribution or curation.


Changing Tactics

In response, I've updated my blog's license from "share-alike" to "no derivatives," hoping to prevent AI from copying my style. However, the legal landscape around this issue remains unclear, especially in Europe.

New Technological Defences

A promising development is the creation of tools that embed subtle changes in image files. These alterations are invisible to humans but can disrupt AI training, potentially "poisoning" the dataset. Glaze and Nightshade are two such tools, though they're still in development and can be resource-intensive to use.
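To give a feel for the general idea only, the sketch below adds a small, bounded pixel-level change to an image file, the kind of alteration a viewer is unlikely to notice at normal sizes. To be clear, this is not the Glaze or Nightshade algorithm and offers no real protection; as I understand it, the real tools optimise their perturbations against the feature extractors AI models rely on, which goes a long way to explaining why they are so slow and resource-hungry. File names here are placeholders.

```python
# Minimal illustration of the concept behind tools like Glaze and Nightshade:
# nudge pixel values by a small, bounded amount so the file changes while the
# image still looks the same to a human viewer. This random noise is NOT a
# real defence; the actual tools compute carefully optimised, style-targeted
# perturbations, which is far more expensive.
import numpy as np
from PIL import Image

def add_small_perturbation(in_path: str, out_path: str, max_shift: int = 3) -> None:
    img = np.asarray(Image.open(in_path).convert("RGB"), dtype=np.int16)
    # Random shift of at most +/- max_shift per channel, well below what most
    # viewers can perceive at normal viewing sizes.
    noise = np.random.randint(-max_shift, max_shift + 1, size=img.shape, dtype=np.int16)
    perturbed = np.clip(img + noise, 0, 255).astype(np.uint8)
    # Save losslessly (PNG) so recompression doesn't wash the changes out.
    Image.fromarray(perturbed).save(out_path)

# Placeholder file names for illustration only.
add_small_perturbation("original.jpg", "perturbed.png")
```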

The Path Forward

Despite these efforts, I'm still uncertain about how to confidently share my work with those who behave ethically while protecting it from misuse. As creatives, we need to stay informed about these issues and continue seeking effective solutions to protect our work in the AI era.

What are your thoughts on protecting creative work in the age of AI? Have you found any effective strategies?


I've prepared this blog post with some good advice, and a little rewording, from Claude.AI.

Wednesday, June 26, 2024

Have you checked your TOS lately?

It would appear that most of the social networks and big internet juggernauts have decided to change their terms of service (TOS). Interestingly, most indicated the changes took effect on the 24th of June! I have no idea why the dates line up. The changes also incorporate broadly similar terms, although written in obscure wording.

To paraphrase, the essence is that services offering to upload, store or process your information now let it be known that they are allowed to use it for their own purposes, often adding a phrase such as "to improve our service". Similar conditions and wording have long appeared in the TOS of such services so that they can handle our data, but these "clarifications" have muddied the water. Many users are starting to worry that this also means the services can use their data in any way they like, such as to train AI, or to modify and republish it as they wish.

Needless to say, this has worried many people, particularly artists using services such as Adobe. Supposedly scores of users have tried to abandon their Adobe subscriptions, only to find it very difficult, for some almost impossible, to leave without incurring significant fees. Similar problems and perceptions are affecting artists on Instagram; I'm not sure of the exact numbers, but there appears to have been a great rush away from Instagram to an alternative, Cara.

I actually do read the terms of service and must admit the changes are a bit hard to follow. However, I do think they leave the way open for the services to claim ownership of anything you post on their particular offerings. In a few places they do still say that you own the copyright, but I assume their lawyers just haven't seen that clause yet.

Most of these services actually rely on us to supply the content that they then package up and use to convince advertisers to pay them money (often very big money). The problem is they "believe" we are happy to have free access to their service and the public exposure the world wide web can offer, without being paid for our content. Yes, these juggernauts have costs, but their profits are larger: massive, even unimaginably, disgustingly exorbitant!

So I decided some time ago not to share anything I intend to sell (like finished artwork or photos) in a readily copyable form anywhere on the internet. I still publish on the likes of Flickr, Blogger, Instagram and my own website: things I'm working on, progress updates and just stuff I find interesting. This works for me because I'm not relying on the internet to sell my works.


Where does that leave us creatives and what can we do about it? 





I’m probably gonna leave social media to the bots for now and see what happens. Maybe play around a little in Cara.

Monday, June 24, 2024

The Wild West Side of AI

It seems as if the world wide web is adopting the ways of the Wild West: no laws, so if you see something worthwhile you take it. You don't even need to do it at gunpoint these days; you can just silently scrape it and make a copy, which is ever so easy for digital information. This despite the fact that there are actually laws in place that should stop people doing this. The problem is that copyright is complex and varies between jurisdictions, whereas the web goes everywhere. I think the fact that a lot of these services consider themselves platforms and not publishers is a very weak cop-out; even if it may be marginally legal, it's probably not moral.

I cannot believe, for instance, that X, and specifically Elon Musk, supposedly champions free speech by letting very dubious characters peddle hate speech and straight-out lies, like spreading politically motivated fake news and fanciful, long-discredited conspiracy theories, while at the same time fighting a government trying to take down footage of a teenage terrorist stabbing a priest in the face, which the teen streamed live online. I believe the Australian government's request to have it taken down was quite morally legitimate. Have they no shame? I guess not.

So don't expect the big guys on the internet, or many others without a moral compass, to respect your work or your loyal support. They will take what they can, then throw you under the bus. However, I like the idea of sharing what I know and what I have created; I just don't want it reused without reference to me, or straight-out stolen.

PS: Can you see the sunglass-wearing laughing face? Is it an example of the intelligence of generative AI, or just another example of it hallucinating (aka getting it wrong)? Or is it our own intelligence at work, recognising patterns and shapes (e.g. faces in clouds or bandanas)?

Tuesday, June 18, 2024

Enjoying the Warmth

When making a stitched panorama into the sun, it is important to keep the exposure constant to avoid dramatic colour changes across a clear blue sky. I've left in the lens flare artefacts; they add to the "sparkle".