
Automattic Faces Scrutiny Over AI Access Policy – WP Tavern

This article is a joint effort by James Giroux & Jyolsna.

After unconfirmed reports of Google entering into a content licensing agreement with Reddit for training its AI, 404 Media claimed yesterday that Automattic is set to sell Tumblr and WordPress.com users’ content to Midjourney and OpenAI. If true, this would mirror an extended partnership that Shutterstock entered into with OpenAI last year.

Claims of 404 Media 

404 Media claims to have insider information about the deal, backed up with documentation, confirming that Automattic is in the advanced stages of negotiation with these AI companies. To support its claims, 404 Media quoted Tumblr Product Manager Cyle Gage, who reported on an internal message board on the status of the initial data collection process and how it included content that should not have been collected.

While 404 Media has provided quotes from an internal source, it has not provided any specific evidence, such as screenshots of conversations or access to source materials, to help others validate its claims. 404 Media also refers to user content as “users’ data”, which can easily be misconstrued as personally identifiable information (PII) or credit card information, whereas the content being discussed in the article is already publicly accessible.

Response From Automattic 

Within a few hours of 404 Media’s article going up, Automattic released a statement describing its position on content distribution and the rights of all users on WordPress.com and Tumblr to opt out of their public content being included in data shared with AI partners.

Automattic argues that AI regulation and legislation don’t yet exist and that, as such, it is taking these steps to proactively give users more ways of controlling how and where their content is made available. It is creating a pathway for AI partners to get streamlined access to the content users are open to sharing, while also taking steps to remove access to content that users no longer want shared. In other words, the content in question is already available to the AI companies because it’s publicly crawlable, and content deals only make it more accessible and manageable.

Automattic published “Protecting User Choice”, emphasizing the following points:

  • We currently block, by default, major AI platform crawlers, including ones from the biggest tech companies, and update our lists as new ones launch.
  • We have a setting to discourage search engines from indexing a site on WordPress.com and Tumblr. This signals to search engines not to crawl that content or include it in search results.
  • We have added similar settings to WordPress.com and Tumblr to discourage crawling by AI companies. If you already discourage search engine indexing, this is automatically enabled.
  • We’ll share only public content that’s hosted on WordPress.com and Tumblr from sites that haven’t opted out.
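
Automattic’s statement does not say which mechanism it uses to block or discourage crawlers. One widely used convention for this is a robots.txt file, and the sketch below shows how such rules can be evaluated with Python’s standard library. The robots.txt contents, the choice of crawler names, and the `is_allowed` helper are illustrative assumptions, not Automattic’s actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt used only for illustration. The rules below are
# assumptions and do not reproduce any real WordPress.com or Tumblr file.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/2024/02/some-post/"
    for agent in ("GPTBot", "CCBot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Rules like these are advisory: they only work if a crawler chooses to honor them, which is consistent with the statement’s wording that the settings “discourage” crawling.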


The article continues, hinting at a deal in the future: “We’re also working directly with select AI companies as long as their plans align with what our community cares about: attribution, opt-outs, and control. Our partnerships will respect all opt-out settings. We also plan to take that a step further and regularly update any partners about people who newly opt out and ask that their content be removed from past sources and future training.”

Automattic also launched a new tool that “lets you opt out of sharing content from your public blogs with third parties, including AI platforms that use such content for training models. We’ll engage with AI companies that we can have productive relationships with, and are working to give you an easy way to control access to your content…We already discourage AI crawlers from gathering content from WordPress.com and will continue to do so, save for those with which we partner… We’re committed to making sure our partners respect these decisions.”

WordPress.org Users Aren’t Affected

Josepha Haden Chomphosy, Executive Director of WordPress, shared this with the community in the Slack channel: “I can confirm that the WordPress project is not involved in selling user data or content for AI training purposes. This has been our consistent stance throughout the long history of WordPress, even as recently as when I was sharing thoughts for the future of our project heading into 2023.”

Later, Jetpack tweeted that “data from Jetpack connected sites is not included. This only applies to WordPress.com hosted sites.”

Interestingly, Automattic has been struggling to make Tumblr profitable since acquiring it in 2019. Last year Matt Mullenweg revealed that Tumblr is losing $30M every year.

We have reached out to Chenda Ngak (Head of Communications at Automattic) and will update this article once we get her quote.

(WordPress (or WordPress.org) is an open-source CMS, while WordPress.com is a hosted platform owned by Automattic, a company founded by Matt Mullenweg. The two are not the same.)


