Standards to indicate permissions for internet scraping for AI training
As a software engineer, I support ai.txt as a simple, respectful way
to express AI usage preferences. For wide adoption, it must be easy to
implement, enforceable, and fine-grained enough for creators to
distinguish between indexing, caching, and training use cases. I also
recommend establishing a shared, standardized registry of AI user-agents
and encouraging bots to self-identify with verifiable headers.
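
To make the fine-grained point concrete, here is a minimal Python sketch
of how a crawler might read per-purpose preferences. The file syntax, the
directive names (Indexing, Caching, Training) and the bot name
ExampleResearchBot are all assumptions made up for illustration; none of
them come from the IETF drafts, which have not settled on a format.

```python
# Hypothetical ai.txt syntax, invented for illustration only.
SAMPLE = """\
# Per-purpose preferences, grouped by user-agent
User-Agent: *
Indexing: allow
Caching: allow
Training: disallow

User-Agent: ExampleResearchBot
Training: allow
"""

# Purposes this sketch understands (an assumption, not a standard list).
PURPOSES = {"indexing", "caching", "training"}


def parse_ai_txt(text):
    """Return {user_agent: {purpose: bool}} from the hypothetical format."""
    policy = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = (part.strip().lower() for part in line.split(":", 1))
        if key == "user-agent":
            # Start (or reopen) the rule group for this agent.
            current = policy.setdefault(value, {})
        elif key in PURPOSES and current is not None:
            current[key] = (value == "allow")
    return policy


def is_allowed(policy, user_agent, purpose):
    """Look up a purpose for an agent, falling back to the '*' group.

    Returns None when the site expressed no preference for that purpose.
    """
    for group in (user_agent.lower(), "*"):
        rules = policy.get(group, {})
        if purpose in rules:
            return rules[purpose]
    return None


if __name__ == "__main__":
    policy = parse_ai_txt(SAMPLE)
    for agent in ("ExampleResearchBot", "SomeOtherBot"):
        for purpose in sorted(PURPOSES):
            print(agent, purpose, is_allowed(policy, agent, purpose))
```

Returning None for an unstated purpose keeps "no preference expressed"
separate from an explicit disallow, which is one possible design choice.
A parser like this only records preferences. Enforcement and verifiable
bot identity would still need the registry and headers mentioned above.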
Aaron Muuo.
On Thu, May 15, 2025 at 8:57 PM Benson Muite via KICTANet <
[email protected]> wrote:
> There is an IETF process underway to standardize a machine readable way to
> indicate permissions associated with how online content can be used for AI
> training. The aim is to find an ai.txt specification analogous to the
> robots.txt specification put on many websites. For more information see:
> datatracker.ietf.org/group/aicontrolws/materials/
>
> Feedback is sought.
>
> Benson
>