Twitter has posted an update on how its moderation approach, called “Freedom of Speech, Not Reach,” is working, and according to the company, it has seen some encouraging results. In April, the website began limiting the reach of tweets that violated its hateful conduct policy and applying a label that read: “Limited visibility: This tweet may violate Twitter’s rules against hateful conduct.” Twitter says it has applied the label to more than 700,000 posts since then and has proactively prevented ads from appearing next to that content.

The company also claimed that the label reduces a post’s reach by 81 percent, effectively limiting the visibility of posts that potentially exhibit hateful conduct. In addition, Twitter revealed in its update that more than a third of users choose to delete labeled tweets themselves once they’ve been notified that the tweets violated the website’s policy, and only four percent of authors have appealed the labels.

Because the company charges for API access, most researchers studying hate speech cannot independently verify these claims. But Twitter clearly believes that its approach has been effective so far. In fact, the website is going ahead with its plan to expand its labels to cover more types of policy violations.

According to its announcement, Twitter will now also label and downrank posts that violate its abusive behavior and violent speech policies. The tweets to be labeled in the coming weeks will include posts with malicious content directed at individuals, those that encourage others to harass a person or group of people, those that threaten to inflict physical harm on others, and those that encourage others to commit acts of violence or harm.

